Downhole Camera Runs Validate the Capability of Machine Learning Models to Accurately Predict Perforation Entry Hole Diameter

Nashed, Samuel; Lnu, Srijan; Guezei, Abdelali; Ejehu, Oluchi; Moghanloo, Rouzbeh

doi:10.3390/en17225558


by Samuel Nashed *, Srijan Lnu, Abdelali Guezei, Oluchi Ejehu and Rouzbeh Moghanloo

Mewbourne School of Petroleum and Geological Engineering, Mewbourne College of Earth and Energy, The University of Oklahoma, Norman, OK 73019, USA

* Author to whom correspondence should be addressed.

Energies 2024, 17(22), 5558; https://doi.org/10.3390/en17225558

Submission received: 11 October 2024 / Revised: 29 October 2024 / Accepted: 4 November 2024 / Published: 7 November 2024

(This article belongs to the Section H: Geo-Energy)


Abstract

In the field of oil and gas well perforation, it is imperative to accurately forecast the casing entry hole diameter under full downhole conditions. Precise prediction of the casing entry hole diameter enhances the design of both conventional and limited entry hydraulic fracturing, mitigates the risk of proppant screenout, reduces skin factors attributable to perforation, guarantees the presence of sufficient flow areas for the effective pumping of cement during a squeeze operation, and reduces issues related to sand production. Implementing machine learning and deep learning models yields immediate and precise estimations of entry hole diameter, thereby facilitating the attainment of these objectives. The principal aim of this research is to develop machine learning-based models capable of predicting entry hole diameter under full downhole conditions. Ten machine learning and deep learning models have been developed utilizing readily available parameters routinely gathered during perforation operations, including perforation depth, rock density, shot phasing, shot density, fracture gradient, reservoir unconfined compressive strength, casing elastic limit, casing nominal weight, casing outer diameter, and gun diameter as input variables. These models are trained on actual casing entry hole diameter data acquired from deployed downhole cameras, which serve as the output for the models. A comprehensive dataset from 53 wells has been utilized to develop and fine-tune the machine learning algorithms. These include Gradient Boosting, Linear Regression, Stochastic Gradient Descent, AdaBoost, Decision Trees, Random Forest, K-Nearest Neighbor, neural networks (with L-BFGS and Adam optimizers), and Support Vector Machines. 
The results of the most effective machine learning models, specifically Gradient Boosting, Random Forest, AdaBoost, neural network (L-BFGS), and neural network (Adam), reveal exceptionally low values of mean absolute percent error (MAPE), root mean square error (RMSE), and mean squared error (MSE) in comparison to actual measurements of entry hole diameter. The recorded MAPE values are 4.6%, 4.4%, 4.7%, 4.9%, and 6.3%, with corresponding RMSE values of 0.057, 0.057, 0.058, 0.065, and 0.089, and MSE values of 0.003, 0.003, 0.003, 0.004, and 0.008, respectively. These low MAPE, RMSE, and MSE values verify the remarkably high accuracy of the generated models. This paper offers novel insights by demonstrating the improvements achieved in ongoing perforation operations through the application of a machine learning model for predicting entry hole diameter. The utilization of machine learning models presents a more accurate, expedient, real-time, and economically viable alternative to empirical models and deployed downhole cameras. Additionally, these machine learning models excel in accommodating a broad spectrum of guns, well completions, and reservoir parameters, a challenge that a singular empirical model struggled to address.

Keywords: machine learning; artificial intelligence; perforation entry hole diameter; gun; well completion; neural network

1. Introduction

1.1. The Importance of Predicting Perforation Entry Hole Diameter

Explosive perforation has constituted a prevailing technique for establishing communication between the reservoir and the wellbore for over seven decades [1]. The impact of the casing entry hole diameter (EHD) on well performance has been extensively analyzed through various significant studies and has attracted substantial academic interest over many years [2,3,4,5,6]. Smaller EHD will result in increased perforation friction, which will subsequently elevate the perforation skin and injection pressure at the surface [7,8]. Additionally, it will hinder proppant placement into the hydraulic fracture network, ultimately leading to premature proppant screenout or diminished cumulative production of oil and gas over time [9,10]. The ability to precisely predict EHD under comprehensive downhole conditions has become an essential element in today’s world, particularly with complex multi-stage hydraulic fracturing treatments [11]. This forecasting enhances the design of hydraulic fracturing [12]; the identification of the optimal EHD and the associated perforation friction facilitates superior distribution of proppant and fracturing fluid across all clusters within the well [13]. Perforation friction pressure serves as an indicator of the pressure differential between the wellbore and the reservoir; thus, maintaining this pressure at a sufficiently elevated level is pivotal in promoting uniform fracture treatment distribution across all clusters of a given stage throughout the treatment process when designing and executing limited entry fracture operations [14]. Furthermore, appropriately sized perforations diminish the likelihood of proppant screenout attributable to excessive perforation friction [15]. The predictive Equation (1) for pressure drop through an orifice is founded upon the Bernoulli theorem and is utilized to estimate perforation friction pressure. 
This equation elucidates that the pressure drop across perforations is contingent upon the mass flow rate divided by the effective wellbore outflow area. Equation (1) also indicates that perforation friction pressure is profoundly influenced by variations in perforation diameter. Equation (2) is derived from Equation (1) and serves as its equivalent when perforation area is substituted for diameter. Equation (2) is anticipated to yield greater accuracy when the geometry of the perforation is non-circular [16]. Several empirical correlations have been developed for perforation friction pressure, such as those mentioned in a 1997 paper by A.M. El-Rabaa [17]. The accuracy of such correlations can be limited by factors such as perforation erosion during proppant displacement [18]. This erosion has a huge impact on EHD and perforation friction, as a study by Wu and Sharma et al. (2016) highlights [19]. Moreover, other factors like back pressure in the wellbore and the idealized simple cylinder shape assumed in these models can also influence the EHD and perforation friction [7,20]. Additionally, meticulous perforation design can support well integrity, mitigate skin factors due to perforation geometry and damage effects, ensure the presence of adequate flow area (and flow velocity) requisite for the effective pumping of cement during a squeeze operation, and reduce issues related to sand production [21]. Ultimately, optimal perforation dimensions contribute to an enhanced recovery factor, thereby maximizing the economic returns derived from the reservoir [22].

\Delta P_{p} = \frac{0.2369 \times Q^{2} \times \rho}{C_{d}^{2} \times N^{2} \times D^{4}}

(1)

\Delta P_{p} = \frac{0.1461 \times Q^{2} \times \rho}{A^{2} \times N^{2} \times C_{d}^{2}}

(2)

where ΔP_p = pressure drop across perforations, psi; Q = injection rate, bbl/min; ρ = fluid/slurry density, lb/gal; C_d = discharge coefficient; N = number of perforations; D = perforation EHD in casing, in.; and A = perforation area, in.² (computed from the pixel count of the calibrated image).
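As a minimal sketch of how Equations (1) and (2) can be evaluated, the following Python snippet computes the perforation friction pressure from either the EHD or the perforation area and checks that the two forms agree for a circular hole. The function names and all input values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def perf_friction_from_diameter(q_bpm, rho_ppg, cd, n_perfs, d_in):
    """Equation (1): pressure drop across perforations (psi) from EHD (in.)."""
    return 0.2369 * q_bpm**2 * rho_ppg / (cd**2 * n_perfs**2 * d_in**4)


def perf_friction_from_area(q_bpm, rho_ppg, cd, n_perfs, area_in2):
    """Equation (2): pressure drop across perforations (psi) from area (in.^2)."""
    return 0.1461 * q_bpm**2 * rho_ppg / (area_in2**2 * n_perfs**2 * cd**2)


# Hypothetical treatment parameters (assumptions, not values from the paper)
q, rho, cd, n = 60.0, 8.6, 0.8, 40   # bbl/min, lb/gal, dimensionless, count
d = 0.36                             # EHD in inches
area = math.pi * d**2 / 4.0          # area of a circular hole of diameter d

dp_d = perf_friction_from_diameter(q, rho, cd, n, d)
dp_a = perf_friction_from_area(q, rho, cd, n, area)
print(f"Eq. (1): {dp_d:.1f} psi, Eq. (2): {dp_a:.1f} psi")

# Halving the diameter multiplies the pressure drop by 2**4 = 16
dp_half = perf_friction_from_diameter(q, rho, cd, n, d / 2.0)
print(f"Ratio for half the diameter: {dp_half / dp_d:.1f}")
```

For a circular hole, substituting A = πD²/4 into Equation (2) recovers Equation (1), since 0.2369 × π²/16 ≈ 0.1461. The area form becomes the more natural choice when the imaged perforation is not circular.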

1.2. Traditional Prediction Methods

EHD can be obtained either via direct measurement or through an indirect computation informed by empirical models. In the former case, it is necessary to deploy a downhole camera in the wellbore to capture high-resolution images of the actual perforation, which are subsequently analyzed to ascertain the precise EHD under downhole conditions [23]. These actual images of the perforations are essential for validating the empirical prediction models and enhancing our comprehension of the perforation process under full downhole conditions. Furthermore, they can highlight the shortcomings of the API 19B standard lab tests and machine learning prediction models. A total of 279 pre- and 595 post-fracture treatment perforations were captured in a study conducted by Horton (2021), which demonstrated how lab tests and empirical prediction models can be misleading and how much further we must advance our understanding of the perforation process in real-world scenarios [24]. An advanced downhole camera was utilized in this study, capable of acquiring high-resolution color video footage in both lateral and vertical view orientations. While real downhole images represent the most precise methodology, this approach is characterized by high costs and significant time investment. In contrast, the latter approach requires the application of empirical models and correlations. Researchers in the oil and gas sector usually use the data from the API 19B Section 1 test to develop models and correlations that can predict perforation length and EHD under reservoir conditions [25]. Standardized certification evaluations of shaped charge perforators are conducted as specified in API RP-19B Section 1, which mandates the use of water as the wellbore fluid, a single layer of steel casing, and a sufficiently dense layer of concrete to encapsulate the entire penetration depth of the charge [26]. 
This standardized testing protocol fails to yield substantial insights into perforator performance under more complex downhole conditions, given that the actual EHD under downhole scenarios is far outside the scope of the published API RP-19B data [8,27]. Conversely, the objective of API RP-19B Section 2 (Figure 1 and Figure 2) is to generate shaped charge penetration and EHD performance data within stressed natural rock, which more accurately reflects a downhole environment than unstressed concrete (i.e., API RP-19B Section 1) [28]. While Section 2 testing attempts to overcome the limitations of Section 1, several studies show that it remains challenging to represent downhole conditions such as actual reservoir core at high temperature and pressure, along with under- or overbalanced perforation [26,28]. Therefore, the results may not always align with actual field measurements [29]. These empirical models consequently carry high uncertainties, which makes the evaluation of perforation design a considerable challenge [30]. In addition, the indirect empirical approach is rendered impractical by its substantial data requirements, frequent calibration necessities, and the inherently time-intensive nature of its application [31].

1.3. Machine Learning Models for Predicting EHD

A rapid prediction of EHD across a variety of gun systems and completion scenarios with acceptable accuracy is imperative to address this gap within the oil and gas sector and to maintain competitiveness in today’s dynamic marketplace. Methodologies rooted in machine learning (ML) and deep learning (DL) have attracted considerable interest within the energy sector, with ML and data-driven modeling frequently employed interchangeably due to their inherent interconnection [33,34,35]. Data-driven models capitalize on readily available field data to establish statistical correlations between input variables and the variables of interest [36,37,38]. Advanced analysis such as Fourier analysis significantly enhances the accuracy of the machine learning models and makes them more attractive for the energy sector over time [39,40,41]. Machine learning applications have gained significant attention across various research fields, especially in production and reservoir engineering due to their superior efficiency when compared to other options [42,43,44]. However, developing a reliable ML model poses difficulties due to the multitude of approaches and processes available to tackle a particular issue, starting with data preprocessing, progressing through model training and optimization, and concluding with the validation and testing phases [45,46]. Concerning ML applications related to the estimation of EHD, there have been few instances and practical implementations of ML and DL techniques that have been introduced. These examples shed light on the challenges and promising research directions in this domain, emphasizing the urgent need for groundbreaking machine learning and data mining techniques to promote sustainable advancement. A research study carried out by Keshavarzi et al. 
in 2010 aimed at forecasting perforation length using an artificial neural network model demonstrated exceptional prediction accuracy, achieving a correlation coefficient of 0.98, which underscored the significant capability of machine learning models to estimate perforation parameters in downhole conditions [47]. Although data-driven models are regarded as the optimal strategy for estimating EHD, insufficient attention has been afforded to the data-driven models. This paper aims to bridge this deficiency by developing ten advanced machine learning models trained by actual field data to predict perforation EHD under downhole conditions. The utilization of machine learning models presents a more accurate, expedient, real-time, and economically viable alternative to empirical models and deployed downhole cameras. These ML models possess the capability to be effortlessly integrated into perforation simulation software for instantaneous estimation of EHD. Given its computational simplicity and implementation ease, which does not necessitate regular calibration or incur high operational costs, this solution is characterized as innovative and efficient. Additionally, these machine learning models excel in accommodating a broad spectrum of guns, well completions, and reservoir parameters, a challenge that a singular empirical model struggled to address. Consequently, production and reservoir engineers may utilize it as a significant tool for optimizing hydraulic fracturing designs, perforation skin values, well integrity, cement squeeze operations, and sand production strategies.

2. Methodology

The research methodology employed in this study is meticulously organized around five essential stages (Figure 3). The deliberate design of each phase ensures a methodical progression, with all stages working together to achieve the overall research goal.

2.1. Data Collection

In Egypt’s Western Desert, data were gathered from 53 wells situated in different oil fields and then combined. This dataset comprised EHD, gun depth (D), rock density (RD), shot phasing (SP), shot density (SD), fracture gradient (FG), reservoir unconfined compressive strength (UCS), casing elastic limit (CEL), casing nominal weight (CWT), casing outer diameter (COD), and gun diameter (GD). These combined data were utilized for the construction and validation of the ML models, in addition to comparing the results with the actual EHD obtained via a downhole camera. Downhole camera runs were executed subsequent to the perforation operations to facilitate data collection and optimization objectives. In total, 1716 data points derived from actual field measurements were collected. The datasets were acquired from multiple sources, each characterized by unique formats and varying update frequencies, and comprised the parameters specified in Table 1. The data represent a varied array of reservoir, completion, and gun characteristics, and this extensive variety of parameters allowed a generalized model to be developed. Every record used for training and testing incorporated downhole camera outputs, gun specifications, reservoir characteristics, and casing details. This approach not only enhanced the model’s adaptability but also improved its overall accuracy and reliability in addressing the complex challenges posed by the wide array of parameters.

Figure 4 illustrates the pair plot of the comprehensive dataset employed for the purposes of ML predictive modeling. The pair plot functions as an effective tool for analyzing the distributions and interrelationships within a dataset. It serves as an invaluable instrument for acquiring a comprehensive perspective of the dataset through a singular visual representation. The off-diagonal elements provide insights into the correlations among various variables, whereas the diagonal elements yield distributions regarding individual variables.

Figure 5 illustrates violin plots of the comprehensive dataset employed for the purposes of EHD predictive modeling. Violin plots visualize the distribution of data across different categories. The width of each violin represents the frequency of data points at that value. The white dot in the center represents the median, and the thick black bar represents the interquartile range. The thin black lines extend to show the rest of the distribution, except for points that are considered outliers. From these plots, most parameters show either normal distributions or distinct multimodal patterns. Some parameters (such as shot phasing and gun diameter) take discrete values, while others are continuous. Very few outliers lie beyond the main distributions, and most distributions are concentrated in the range of 0.2 to 0.6.

2.2. Feature Ranking

ML models acquire insights from the datasets presented to them. The predictive effectiveness of these models is considerably dependent on the caliber and interrelation of input features with the target variable. Thus, comprehending the significance of features concerning the correlation coefficient is essential. In this study, the correlation coefficient (R) was calculated utilizing two distinct methodologies, specifically Pearson’s R and Spearman R. The R-value consistently ranges from “1” to “−1”, where the latter denotes an inverse correlation between two variables, while the former indicates a direct relationship among the variables. A value approaching “0” implies either a weak or nonexistent relationship among the variables. The definitions for both criteria are articulated by Equations (3) and (4).

\rho_{Pearson} = \frac{n \sum x y - (\sum x)(\sum y)}{\sqrt{\left[n \sum x^{2} - (\sum x)^{2}\right] \left[n \sum y^{2} - (\sum y)^{2}\right]}}

(3)

where x and y are two variables, and n is the total number of samples.

Spearman R assesses the monotonic relationship between two variables, whether linear or nonlinear, by applying Pearson’s correlation to the ranks of the variables. A perfect monotone function of one variable on the other results in an R-value of either +1 or −1.

\rho_{Spearman} = \frac{\operatorname{cov}(x, y)}{\gamma_{x} \gamma_{y}}

(4)

where cov(x, y) is the covariance of the rank variables, and γ_x and γ_y are the standard deviations of the rank variables.
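The two coefficients can be sketched in a few lines of Python using only the standard library. The snippet below is an illustrative implementation, not the authors' code: `pearson_r` follows the sums form of Equation (3), and `spearman_r` applies it to ranks, with ties given their average rank (an assumption, since the tie-handling rule is not stated in the text). The sample data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation using the sums form of Equation (3)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
    return numerator / denominator


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y is a nonlinear but strictly increasing function of x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v**3 for v in x]

print(f"Pearson:  {pearson_r(x, y):.3f}")
print(f"Spearman: {spearman_r(x, y):.3f}")
```

Because y increases strictly with x, the ranks coincide and Spearman R equals 1, while Pearson R stays below 1 because the relationship is not linear. This is the distinction that makes it useful to report both criteria.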

Figure 6 illustrates the significance of input parameters, particularly D, RD, SP, SD, FG, UCS, CEL, CWT, COD, and GD in relation to EHD. Notably, GD exhibits a robust positive correlation with EHD, while the other parameters present moderate correlations.

Figure 7 illustrates the heat map that shows the correlation coefficient calculated using Pearson’s criteria for the input parameters related to EHD. This visualization aims to clarify the collinearity that exists among the input parameters.

2.3. Data Preprocessing

Through rigorous examination and verification, erroneous data points and compromised downhole camera imagery were systematically eliminated from the collected dataset. Subsequently, missing data were addressed either by removing the affected rows entirely or by filling the gaps using statistical techniques and specialized knowledge. Mean imputation, which substitutes missing values with the average of the available values for that variable, was employed for missing fracture gradient and unconfined compressive strength values. Expert knowledge acquired from offset well data, such as wellbore logs, core analyses, and hydraulic fracturing treatment information, was used to verify the precision of these estimates. The subsequent stage in the database processing involved the removal or thorough documentation of any outliers, followed by the consolidation and normalization of the dataset. Normalization is a data preprocessing strategy that converts numerical values to a uniform scale. Its primary aim is to improve the effectiveness and accuracy of machine learning algorithms that are sensitive to the scale of input variables, so that no single feature dominates the learning process. Among the many normalization methods available, one of the most commonly used is the min-max scaler, which rescales numerical data to the range between 0 and 1. It subtracts the minimum value of the variable from each data point and divides the result by the range of that variable, so that the minimum value maps to 0 and the maximum value maps to 1.

The formula for the min-max scaler technique is as follows:

X_{Normalized} = \frac{X - X_{Minimum}}{X_{Maximum} - X_{Minimum}}

(5)

where X is the input data, X_Normalized is the output that has been normalized, X_Minimum is the minimum value of the input data, and X_Maximum is the maximum value of the input data.
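The two preprocessing steps described in this section, mean imputation and min-max scaling per Equation (5), can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: missing values are represented by `None`, each feature is scaled independently, and the sample values and function names are hypothetical rather than taken from the study.

```python
def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Equation (5): rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical fracture gradient column (psi/ft) with two missing entries
fracture_gradient = [0.70, None, 0.74, 0.72, None]
filled = mean_impute(fracture_gradient)

# Hypothetical gun diameter column (in.)
gun_diameter = [2.0, 2.875, 3.375, 4.5]
scaled = min_max_scale(gun_diameter)

print(filled)
print(scaled)
```

In a train/test workflow, the minimum and maximum would normally be computed on the training subset only and then reused for the test subset, so that information from the test data does not leak into the scaling.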

2.4. Models Structure

Ten machine learning (ML) models were created using the Python 3.10.12 programming language. These models include a wide range of techniques, including Gradient Boosting (GB), Support Vector Machines (SVMs), AdaBoost, K-Nearest Neighbor (KNN), Linear Regression (LR), Decision Trees (DTs), Random Forest (RF), neural networks with L-BFGS and Adam optimizers, and Stochastic Gradient Descent (SGD). This diverse selection of methodologies was chosen to strengthen the study’s conclusions and increase their broader applicability. This section provides a comprehensive account of each model, emphasizing their distinctive characteristics and algorithmic frameworks (Table 2). The models under discussion encompass a spectrum of methodologies, ranging from ensemble learning to regression and non-parametric supervised learning. The Pythagorean Forest (Figure 8) exemplifies the 10 trees generated utilizing the RF model. These trees are represented as Pythagorean trees, with each visualization corresponding to a randomly constructed tree. The most optimal tree is discerned by the brevity and vivid coloration of its branches, signifying that a minimal number of attributes effectively bifurcate the branches. The input data employed for the creation of the trees were predicated on a regression tree, culminating in the coloring of tree nodes in accordance with the standard deviation value.

3. Results and Discussion

3.1. Model Results

The preprocessed dataset was partitioned into two discrete subsets: the first, comprising 80% of the database (1373 entries), was utilized for model training, while the remaining 20% (343 entries) was designated for testing. The outcomes produced by the developed machine learning models are presented in Table 3. The performance metrics for each model, including mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), are specified. The normalized predicted EHD for each model, alongside the normalized actual EHD obtained from downhole camera recordings, is depicted in Figure 9. A 45-degree line was employed to illustrate the divergence between the estimated values and the actual EHD. The predictions generated by the machine learning models were notably clustered near the 45-degree line, indicating a strong correlation with the actual values.
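For reference, the split sizes and the error metrics named above can be computed as in the sketch below. The metric definitions are the standard ones; the actual and predicted EHD values are hypothetical and serve only to exercise the function. MAPE is included because it is reported elsewhere in the paper, with the caveat that it is undefined when an actual value is zero.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, MAPE (%) and R^2 for paired observations."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE requires every actual value to be non-zero
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "r2": 1.0 - ss_res / ss_tot,
    }


# 80:20 split of the 1716 data points described in the text
n_total = 1716
n_train = round(0.8 * n_total)
n_test = n_total - n_train
print(n_train, n_test)

# Hypothetical actual and predicted EHD values (in.), for illustration only
actual_ehd = [0.30, 0.36, 0.42, 0.50]
predicted_ehd = [0.32, 0.34, 0.42, 0.47]
metrics = regression_metrics(actual_ehd, predicted_ehd)
print({name: round(value, 4) for name, value in metrics.items()})
```

Note that MSE and RMSE depend on the scale of the target, so values computed on normalized EHD are not directly comparable with values computed in inches.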

Figure 10 clarifies the essential parameters that significantly influence the forecasting of EHD as per the AdaBoost model, recognized as the most accurate ML model in this research. For each parameter, the graph illustrates SHAP values (located on the horizontal axis) corresponding to every data instance (row) in the dataset. The SHAP value signifies the extent to which the feature value influences the predicted EHD in relation to the average prediction. Positive SHAP values (positions to the right of the center) indicate feature values that have a positive impact on the prediction of EHD. Conversely, negative values (positions to the left of the center) imply a negative influence on the EHD prediction. The colors represent the magnitude of each feature, with red signifying higher values and blue indicating lower values. The color scale is established based on all values within the dataset for a specific feature. Notably, parameters such as GD, CEL, COD, and CWT demonstrate a considerable effect on the projected EHD, whereas the remaining parameters exhibit a negligible influence. These findings reinforce the conclusions drawn from Pearson’s R and Spearman R methodologies as depicted in Figure 6.

3.2. Model Testing and Validation

The effectiveness of the ML models is evaluated using K-fold cross-validation alongside repeated random sampling methods. These approaches offer thorough assessment and validation frameworks, ensuring the dependability and applicability of the models.

K-fold cross-validation is a well-established methodology for assessing the performance of an ML system on a specific dataset. A single run of k-fold cross-validation may yield an unstable estimate of model performance, as outcomes from different data partitions can vary significantly. The estimate can be improved by repeating the k-fold procedure several times and reporting the average result across all folds and repetitions. This mean, reported together with its standard error, is expected to provide a more accurate estimate of the true, unknown mean performance of the model on the dataset. The results of the K-fold cross-validation process utilizing 10 folds are presented in Table 4 (the cross-validation procedure was repeated a total of 10 times). Each model is presented alongside its MSE, RMSE, MAE, and R².

Repeated random sampling is acknowledged as an alternative approach for assessing the effectiveness of an ML algorithm applied to a dataset. This methodology includes the random partitioning of the data into training and testing subsets at a predetermined ratio (e.g., 80:20), and this entire procedure is repeated for a specified number of iterations. Table 5 presents the results obtained from a repeated random sampling technique that was executed for a total of 10 iterations. Each model is displayed alongside its MSE, RMSE, MAE, and R² values.
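The two validation schemes can be sketched with the Python standard library as follows. This is an illustration under explicit assumptions, not the authors' pipeline: a one-feature least-squares line stands in for the ML models, the data are synthetic, and the helper names (`fit_line`, `rmse_on`, `repeated_kfold`, `repeated_random_sampling`) are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def rmse_on(x, y, train_idx, test_idx):
    """Fit on the training rows and return the RMSE on the held-out rows."""
    slope, intercept = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
    squared = [(slope * x[i] + intercept - y[i]) ** 2 for i in test_idx]
    return (sum(squared) / len(squared)) ** 0.5


def repeated_kfold(x, y, k=10, repeats=10, seed=0):
    """Reshuffle and run k-fold `repeats` times; one score per fold."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(x)))
        rng.shuffle(idx)
        for fold in range(k):
            test_idx = idx[fold::k]
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            scores.append(rmse_on(x, y, train_idx, test_idx))
    return scores


def repeated_random_sampling(x, y, train_fraction=0.8, repeats=10, seed=0):
    """Draw a fresh random train/test split on every repetition."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(x))
    scores = []
    for _ in range(repeats):
        idx = list(range(len(x)))
        rng.shuffle(idx)
        scores.append(rmse_on(x, y, idx[:n_train], idx[n_train:]))
    return scores


def mean_and_sem(scores):
    """Mean score and its standard error."""
    return statistics.fmean(scores), statistics.stdev(scores) / len(scores) ** 0.5


# Synthetic data (assumption): a noisy linear trend on a normalized feature
data_rng = random.Random(42)
feature = [data_rng.uniform(0.0, 1.0) for _ in range(200)]
target = [0.2 + 0.5 * v + data_rng.gauss(0.0, 0.05) for v in feature]

kfold_scores = repeated_kfold(feature, target)
sampling_scores = repeated_random_sampling(feature, target)
kfold_mean, kfold_sem = mean_and_sem(kfold_scores)
sampling_mean, sampling_sem = mean_and_sem(sampling_scores)
print(f"Repeated 10-fold CV: RMSE = {kfold_mean:.4f} +/- {kfold_sem:.4f}")
print(f"Repeated sampling:   RMSE = {sampling_mean:.4f} +/- {sampling_sem:.4f}")
```

In k-fold cross-validation, every record is held out exactly once per repetition, whereas in repeated random sampling some records may never appear in a test subset. Comparing the two, as in Tables 4 and 5, gives a check that the reported accuracy does not hinge on one particular partition.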

3.3. Field Application

Well-X refers to an onshore oil well situated in the Western Desert of Egypt. This well was vertically drilled to a depth of 7979 feet. Formation Y was perforated followed by the retrieval of a downhole camera, after which hydraulic fracturing treatment was performed. The findings derived from the downhole camera analysis, in conjunction with the parameters related to hydraulic fracturing, are comprehensively outlined in Table 6.

The data acquired from Well-X were utilized to implement the developed AdaBoost model, which is recognized as the most accurate algorithm, to forecast the actual EHD. In comparison to the actual EHD of 0.36 inches, the predicted EHD was calculated to be 0.34 inches. With an absolute deviation of 0.02 inches and a percentage error of 5%, it is evident that the predictions made by the model are highly accurate. Furthermore, the EHD was calculated to be 0.23 inches through the application of an empirical model based on the API RP43 perforation calculations [48]. Equation (1) was employed to compute the pressure drop across perforations for the actual EHD (0.36 inches), the calculated EHD (0.34 inches) derived from the AdaBoost-generated model, and the calculated EHD (0.23 inches) obtained via the empirical API RP43 model. The corresponding pressure drops across the perforations were computed to be 437, 550, and 2626 psi, respectively. These calculations underscore that an inaccurate estimation of the EHD by the empirical model may result in a substantially elevated estimation of perforation friction (2626 psi), which could consequently lead to a misleading modification of the hydraulic fracturing design or unexpected proppant screenout, especially in applications involving limited entry hydraulic fracturing.
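The sensitivity of these pressure drops to EHD follows directly from Equation (1): with the injection rate, fluid density, discharge coefficient, and number of perforations held fixed, the pressure drop scales with D^{-4}. The worked ratios below are a consistency check on the reported figures under that assumption; the subscripts ML and emp are labels introduced here for the AdaBoost-predicted and empirical EHD cases.

```latex
\frac{\Delta P_{p}(D_{2})}{\Delta P_{p}(D_{1})} = \left(\frac{D_{1}}{D_{2}}\right)^{4}

\Delta P_{p,\mathrm{ML}} \approx 437 \times \left(\frac{0.36}{0.34}\right)^{4} \approx 437 \times 1.257 \approx 549 \ \mathrm{psi}

\Delta P_{p,\mathrm{emp}} \approx 437 \times \left(\frac{0.36}{0.23}\right)^{4} \approx 437 \times 6.002 \approx 2623 \ \mathrm{psi}
```

Both values agree with the reported 550 and 2626 psi to within rounding of the inputs. A 0.02 in. error in EHD changes the estimated friction by about 26%, while the 0.13 in. error of the empirical model inflates it roughly sixfold.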

4. Conclusions

This study showed the capability of machine learning and deep learning algorithms to accurately predict the perforation entry hole diameter (EHD) utilizing readily available parameters routinely gathered during perforation operations. The results illustrated the effectiveness of using a holistic approach, including different ML models.

According to the initial phase of data collection from 53 wells in the Western Desert of Egypt, it was obvious that the integration of a wide range of reservoir characteristics, gun specifications, and completion parameters significantly enhanced the model’s adaptability and predictive accuracy. The statistical analysis of the dataset revealed a direct correlation between gun diameter and EHD. This analysis showed the importance of selecting the gun diameter and its huge impact on the EHD.

During the feature ranking phase, Pearson’s and Spearman’s correlation coefficients showed the impact of gun diameter, reservoir UCS, and shot density on EHD. This valuable insight is a tool for completion engineers when optimizing perforation design. Moreover, it reveals the importance of selecting the correct input parameters to obtain the most accurate output.

Through the data preprocessing phase, the issues of data quality were addressed to achieve the most accurate output. The min-max normalization technique was used to ensure that no single feature dominates during the learning process, and that had an enormous effect on the accuracy of the ML models.

The utilization of different ML and DL models in the model structure phase showed distinct accuracy. For instance, Gradient Boosting (GB), Random Forest (RF), and AdaBoost revealed extremely high accuracy compared to traditional models like Linear Regression (LR) and Decision Trees (DTs). These results showed the importance of using multiple models during the study and that more complex models can effectively capture the underlying physical relationships between a wide range of parameters, thus providing an accurate prediction.

The performance of the ML models was evaluated through model testing and validation phases by K-fold cross-validation and repeated random sampling techniques. The K-fold cross-validation results of the top five models (GB, RF, AdaBoost, neural network (L-BFGS), and neural network (Adam)) revealed that the recorded MAPE values were 4.6%, 4.4%, 4.7%, 4.9%, and 6.3%, with corresponding R² values of 0.914, 0.913, 0.911, 0.887, and 0.789, respectively. These low MAPE and high R² values verify the remarkably high accuracy of the generated models.

Moreover, the developed AdaBoost model was used to predict the EHD of Well-X, located in the Western Desert of Egypt. The prediction closely matched the actual EHD of 0.36 inches measured by a downhole camera (Table 6), with an absolute deviation of only 0.02 inches.

In conclusion, this study provides a comprehensive framework for using machine learning to predict EHD as an effective alternative to costly downhole camera deployment and inaccurate API tests, and it illustrates the substantial impact of algorithm choice on the accuracy of the predicted EHD.

Author Contributions

Conceptualization, S.N.; methodology, S.L.; coding, A.G.; validation, A.G., O.E. and S.N.; formal analysis, S.L.; investigation, O.E.; resources, S.N.; data curation, S.N.; writing—original draft preparation, O.E.; writing—review and editing, S.N.; visualization, A.G.; supervision, R.M.; project administration, R.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors wish to convey their profound appreciation to the leadership of Khalda Petroleum Company for granting permission to disseminate this work. The leadership is appropriately recognized for their collaborative spirit and support, emphasizing the significance of their endorsement in enabling the publication of this research. Specifically, the authors extend their heartfelt thanks to the committed completion team, whose invaluable assistance during the data collection phase greatly enhanced the thoroughness and quality of this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

A	perforation area, in.²
Adam	adaptive moment estimation optimization algorithm
C_d	perforation entry hole discharge coefficient, dimensionless
CEL	casing elastic limit, psi
COD	casing outer diameter, in.
CWT	casing nominal weight, lb/ft
D	perforation entry hole diameter in the casing, in.
D	depth, ft
DL	deep learning
DT	Decision Tree
EHD	entry hole diameter, in.
FG	fracture gradient, psi/ft
GB	Gradient Boosting
GD	gun diameter, in.
KNN	K-Nearest Neighbor
L-BFGS	limited-memory Broyden–Fletcher–Goldfarb–Shanno optimization algorithm
LR	Linear Regression
MAE	mean absolute error
MAPE	mean absolute percentage error
ML	machine learning
MSE	mean square error
N	number of perforations
NN	neural network
P_perf	perforation entry hole friction pressure, psi
Q	injection rate, bbl/min
R	correlation coefficient
R²	coefficient of determination
RD	rock density, g/cc
RF	Random Forest
RIH	run in hole
RMSE	root mean square error
SD	shot density, shot/ft
SGD	Stochastic Gradient Descent
SHAP	Shapley additive explanations
SP	shot phasing, degrees
SVMs	Support Vector Machines
UCS	reservoir unconfined compressive strength, psi
ΔPp	perforation entry hole friction pressure, psi
ρ (rho)	fluid/slurry density, lb/gal

References

1. Liu, X.; Li, J.; Yang, H.; Liu, G.; Lian, W.; Wang, B.; Zhang, G. A new investigation on optimization of perforation key parameters based on physical experiment and numerical simulation. Energy Rep. 2022, 8, 13997–14008.
2. Harris, M.H. The Effect of Perforating on Well Productivity. J. Pet. Technol. 1966, 18, 518–528.
3. Hong, K.C. Productivity of Perforated Completions in Formations With or Without Damage. J. Pet. Technol. 1975, 27, 1027–1038.
4. Karakas, M.; Tariq, S.M. Semianalytical Productivity Models for Perforated Completions. SPE Prod. Eng. 1991, 6, 73–82.
5. Harvey, J.; Grove, B.; Zhan, L.; Behrmann, L. New Predictive Model of Penetration Depth for Oilwell-Perforating Shaped Charges; OnePetro: Richardson, TX, USA, 2010.
6. Fituri, M.A.; Munoz, J.; Al Harbi, A.; Abouganem, A. Improving Well Productivity Through Better Perforation Design; OnePetro: Richardson, TX, USA, 2024.
7. Zhang, R.; Wang, L.; Li, J.; Feng, C.; Zhang, Y. Numerical Analysis of Perforation during Hydraulic Fracture Initiation Based on Continuous–Discontinuous Element Method. CMES 2024, 140, 2103–2129.
8. Waters, G.; Weng, X. The Impact of Geomechanics and Perforations on Hydraulic Fracture Initiation and Complexity in Horizontal Well Completions; OnePetro: Richardson, TX, USA, 2016.
9. Rasmuson, C.D.; Walden, J.T.; Smith, C.H.; Pinkett, J. Consistent Entry-Hole Diameter Perforating Charge Reduces Completion Pressure and Increases Proppant Placement; OnePetro: Richardson, TX, USA, 2015.
10. Wutherich, K.D.; Walker, K.J. Designing Completions in Horizontal Shale Gas Wells—Perforation Strategies; OnePetro: Richardson, TX, USA, 2012.
11. Angeles, R.; Tolman, R.; El-Rabaa, W.; Jackson, S.; Nygaard, K. Just-In-Time Perforating for Controlled, Cost-Effective Stimulation and Production Uplift of Unconventional Reservoirs; OnePetro: Richardson, TX, USA, 2012.
12. Simpson, G.; Mercer, A.; Mantell, M.; Bourgeois, C.; Battistel, A.; Pehlke, T.; Littleford, T. Virtually Unplugging Perforations: High-Resolution Acoustic Imaging Enabling Statistical Analysis of Calibration and Post-Frac Perforation Entry and Exit-Hole Datasets; OnePetro: Richardson, TX, USA, 2023.
13. Cramer, D.; Friehauf, K. Methods for Assessing Proppant Coverage Along the Lateral for Plug-and-Perf Treatments; OnePetro: Richardson, TX, USA, 2024.
14. Tan, L.; Xie, L.; He, B.; Zhang, Y. Multi-Fracture Propagation Considering Perforation Erosion with Respect to Multi-Stage Fracturing in Shale Reservoirs. Energies 2024, 17, 828.
15. Merry, H.; Dalamarinis, P. Multi-Basin Case Study of Real-Time Perforation Quality Assessment for Screen Out Mitigation and Treatment Design Optimization Using Tube Wave Measurements; OnePetro: Richardson, TX, USA, 2020.
16. Ranjan, V.; Vermani, S.; Goyal, A.; Pathak, S.; Goyal, R.; Camilo Casallas Gelvez, D.; Singh, A.; Pandey, S.; Roberts, G.; Mehta, R. Downhole Camera Run Validates Limited Entry Fracturing Technique and Improves Pay Coverage in Deep Tight Laminated Gas Reservoir of Western India; OnePetro: Richardson, TX, USA, 2022.
17. El-Rabaa, A.M.; Shah, S.N.; Lord, D.L. New Perforation Pressure Loss Correlations for Limited Entry Fracturing Treatments; OnePetro: Richardson, TX, USA, 1997.
18. Perforation friction modeling in limited entry fracturing using artificial neural network. Egypt. J. Pet. 2019, 28, 297–305.
19. Wu, C.-H.; Sharma, M.M. Effect of Perforation Geometry and Orientation on Proppant Placement in Perforation Clusters in a Horizontal Well; OnePetro: Richardson, TX, USA, 2016.
20. Experimental study on the discharge coefficient of perforation behaviors during hydraulic fracturing treatments. Upstream Oil Gas Technol. 2023, 10, 100086.
21. Shokry, A.; Mahmoud, A.A.; Elkatatny, S. Review of Remedial Cementing: Techniques, Innovations, and Practical Insights; OnePetro: Richardson, TX, USA, 2024.
22. Dontsov, E.; Ponners, C.; Torbert, K.; McClure, M. Practical Optimization of Perforation Design with a General Correlation for Proppant and Slurry Transport from the Wellbore; OnePetro: Richardson, TX, USA, 2024.
23. Sakaida, S.; Hamanaka, Y.; Zhu, D.; Hill, A.D.; Kerr, E.; Estrada, E.; Scofield, R.; Johnson, A. Evaluation of Fluid Containment and Perforation Erosion in Multistage Fracture Treatment; OnePetro: Richardson, TX, USA, 2023.
24. Horton, B. A Shot in the Dark: How Your Post-Fracture Perforation Imaging Can Be Misleading and How to Better Understand Cluster Efficiency and Optimize Limited Entry Perforating; OnePetro: Richardson, TX, USA, 2021.
25. Wu, M.; Zhu, J.; Li, L.; Li, P. Calculation of Perforated Vertical and Horizontal Well Productivity in Low-Permeability Reservoirs. SPE Drill. Complet. 2020, 35, 218–236.
26. Ayre, D.; Atwood, D.; Geerts, S.; Grove, B.; Haggerty, D.; Hardesty, J.; Lattanzio, D.; McNelis, L.; Sampson, T.; Sokolove, C. API RP 19B Section 2 Perforation Tests Conducted at Multiple Facilities to Guide the Latest Section 2 Revision; OnePetro: Richardson, TX, USA, 2017.
27. Saucier, R.J.; Lands, J.F., Jr. A Laboratory Study of Perforations in Stressed Formation Rocks. J. Pet. Technol. 1978, 30, 1347–1353.
28. Procyk, A.D.; Burton, R.C.; Atwood, D.C.; Grove, B.M. Optimized Cased and Perforated Completion Designs Through the Use of API RP-19B Laboratory Testing to Maximize Well Productivity; OnePetro: Richardson, TX, USA, 2012.
29. Haggerty, D.J.; Manning, J.D.; Nguyen, P.D.; Rickman, R.D.; Dusterhoft, R.G. Sand Consolidation Testing in an API RP 19B Section IV Perforation Flow Laboratory; OnePetro: Richardson, TX, USA, 2009.
30. Behie, A.; Settari, A. Perforation Design Models for Heterogeneous, Multiphase Flow; OnePetro: Richardson, TX, USA, 1993.
31. Venghiattis, A.A. Prediction of the Efficiency of a Perforator Down-Hole Based on Acoustic Logging Information. J. Pet. Technol. 1963, 15, 761–768.
32. Grove, B.; Manning, D. Shaped Charge Perforation Depth at Full Downhole Conditions: New Understandings; OnePetro: Richardson, TX, USA, 2018.
33. Eliebid, M.; Hassan, A.; Mahmoud, M.; Abdulraheem, A. A New Approach to Quantify the Wellhead Performance for Gas Condensate Reservoirs Using Artificial Intelligent Techniques; OnePetro: Richardson, TX, USA, 2022.
34. Gharieb, A.; Gabry, M.A.; Elsawy, M.; Algarhy, A.; Ibrahim, A.F.; Darraj, N.; Sarker, M.R.; Adel, S. Data Analytics and Machine Learning Application for Reservoir Potential Prediction in Vuggy Carbonate Reservoirs Using Conventional Well Logging; OnePetro: Richardson, TX, USA, 2024.
35. Thabet, S.A.; El-Hadydy, A.A.; Gabry, M.A. Machine Learning Models to Predict Pressure at a Coiled Tubing Nozzle’s Outlet During Nitrogen Lifting; OnePetro: Richardson, TX, USA, 2024.
36. Gasser, M.; Naguib, A.; Abdelhafiz, M.; Elnekhaily, S.; Mahmoud, O. Artificial Neural Network Model to Predict Filtrate Invasion of Nanoparticle-Based Drilling Fluids. Trends Sci. 2023, 20, 6736.
37. Gharieb, A.; Elshaafie, A.; Gabry, M.A.; Algarhy, A.; Elsawy, M.; Darraj, N. Exploring an Alternative Approach for Predicting Relative Permeability Curves from Production Data: A Comparative Analysis Employing Machine and Deep Learning Techniques; OnePetro: Richardson, TX, USA, 2024.
38. Thabet, S.; Elhadidy, A.; Elshielh, M.; Taman, A.; Helmy, A.; Elnaggar, H.; Yehia, T. Machine Learning Models to Predict Total Skin Factor in Perforated Wells; OnePetro: Richardson, TX, USA, 2024.
39. Zhang, H.; Shi, D.; Zha, S.; Wang, Q. A modified Fourier solution for sound-vibration analysis for composite laminated thin sector plate-cavity coupled system. Compos. Struct. 2019, 207, 560–575.
40. Albasu, F.; Kulyabin, M.; Zhdanov, A.; Dolganov, A.; Ronkin, M.; Borisov, V.; Dorosinsky, L.; Constable, P.A.; Al-masni, M.A.; Maier, A. Electroretinogram Analysis Using a Short-Time Fourier Transform and Machine Learning Techniques. Bioengineering 2024, 11, 866.
41. Sweiss, M.; Assi, S.; Barhoumi, L.; Al-Jumeily, D.; Watson, M.; Wilson, M.; Arnot, T.; Scott, R. Qualitative and quantitative evaluation of microalgal biomass using portable attenuated total reflectance-Fourier transform infrared spectroscopy and machine learning analytics. J. Chem. Technol. Biotechnol. 2024, 99, 92–108.
42. Elkhatib, O.; Abdallah, M.; Elnaggar, H.; Hanamertani, A.S.; Al-Shalabi, E.; Ahmed, S. Huff-n-Puff Foam Injection in Naturally Fractured Carbonates Using Supercritical CO₂; OnePetro: Richardson, TX, USA, 2024.
43. Gharieb, A.; Adel Gabry, M.; Algarhy, A.; Elsawy, M.; Darraj, N.; Adel, S.; Taha, M.; Hesham, A. Revealing Insights in Evaluating Tight Carbonate Reservoirs: Significant Discoveries via Statistical Modeling. An In-Depth Analysis Using Integrated Machine Learning Strategies; OnePetro: Richardson, TX, USA, 2024.
44. Thabet, S.; Zidan, H.; Elhadidy, A.; Taman, A.; Helmy, A.; Elnaggar, H.; Yehia, T. Machine Learning Models to Predict Production Rate of Sucker Rod Pump Wells; OnePetro: Richardson, TX, USA, 2024.
45. Kumar, K.S.; Avula, V.R.; Sharif, M.A.; Sagar, S.A.; Jasim, M.T.; Varma, K.G. Machine Learning-Based Regression Model for Detection of Petroleum Engineering Problems. In Proceedings of the 2024 International Conference on Recent Advances in Electrical, Electronics, Ubiquitous Communication, and Computational Intelligence (RAEEUCCI), Chennai, India, 17–18 April 2024; pp. 1–6. Available online: https://ieeexplore.ieee.org/document/10547969 (accessed on 8 September 2024).
46. Thabet, S.A.; Elhadidy, A.A.; Heikal, M.; Taman, A.; Yehia, T.A.; Elnaggar, H.; Mahmoud, O.; Helmy, A. Next-Gen Proppant Cleanout Operations: Machine Learning for Bottom-Hole Pressure Prediction; OnePetro: Richardson, TX, USA, 2024.
47. Keshavarzi, R.; Jahanbakhshi, R.; Nadgaran, H.; Aliyari, M. A Neural Network Approach for Predicting the Penetration Depth During Laser Perforation In Limestone; OnePetro: Richardson, TX, USA, 2010.
48. Ott, R.E.; Bell, W.T.; Harrigan, J.W.; Golian, T.G. Simple Method Predicts Downhole Shaped-Charge Gun Performance. SPE Prod. Facil. 1994, 9, 171–178.

Figure 1. Perforating environment: actual downhole conditions (left); laboratory apparatus to yield comparable penetration performance (right). (Adapted from Grove and Manning, 2018) [32].

Figure 2. Close-up of perforating gun/casing/cement region in laboratory apparatus from Figure 1. (1) In-gun clearance, (2) Gun scallop thickness, (3) Fluid gap distance, (4) Casing plate thickness and (5) Cement annulus thickness (Adapted from Grove and Manning, 2018) [32].

Figure 3. A diagram of the methodology.

Figure 4. Pair plot of the total dataset used for EHD prediction.

Figure 5. Violin plots for each parameter in the dataset.

Figure 6. Feature ranking of the input parameters with entrance hole diameter.

Figure 7. Heat map of Pearson’s correlation coefficients for the total dataset used for the prediction of EHD.

Figure 8. Pythagorean Forest showing all Decision Tree models learned by the RF model.

Figure 9. Plots of normalized actual EHD versus EHD predicted by each algorithm: (a) AdaBoost, (b) RF, (c) GB, (d) NN (L-BFGS), (e) NN (Adam), (f) SVMs, (g) KNN, (h) Tree, (i) LR, and (j) SGD.

Figure 10. SHAP plot of the AdaBoost model.

Table 1. Statistical analysis for the collected database.

Parameter	Unit	MIN	MAX	Average	Median
Entrance hole diameter	inches	0.18	0.58	0.33	0.32
Perforation depth	ft	5113	14,098	9679	10,052
Rock density	g/cc	1.6	3.3	2.28	2.3
Shot phasing	degrees	0	180	80.4	60
Shot density	shot/ft	4	12	7.5	6
Fracture pressure	psi	3815	12,006	7705	7321
Reservoir UCS	psi	1502	3745	2626	2704
Casing elastic limit	kpsi	55	88	66	55
Casing nominal weight	lb/ft	26	47	37	47
Casing OD	inches	7	9.625	8.4	9.625
Gun diameter	inches	2	4.5	3.6	3.4

Table 2. Summary of the ML models used.

Model	Description	Algorithm Parameters
GB	A methodological framework of ensemble learning that incrementally generates multiple Decision Trees. Each subsequent Decision Tree is instructed to rectify the deficiencies identified in its predecessor. The conclusive prediction is derived from the weighted aggregation of the forecasts produced by all Decision Trees.	Number of trees is 100. Learning rate is 0.099. Limit depth is 6. Minimum subset size is 2. The fraction of training instances is 1.
AdaBoost	Statistical classification meta-algorithm. The outcomes yielded by alternative learning algorithms, often referred to as “weak learners,” are amalgamated to form a weighted aggregate that signifies the ultimate results of the boosted classifier. AdaBoost exhibits adaptability in that it modifies subsequent weak learners to prioritize instances that previous classifiers misclassified. While individual learners may exhibit subpar performance, the aggregate model can be demonstrated to converge towards a robust learner, provided that each learner performs marginally better than mere chance.	The number of estimators is 100. Learning rate is 1. Regression loss function is linear.
RF	An ensemble learning methodology that generates a multitude of Decision Trees and amalgamates their predictions to yield a more accurate and dependable model. Each Decision Tree is trained utilizing a randomly selected subset of the training dataset and a randomly chosen subset of the features. The ultimate prediction is derived from the mean of all individual Decision Trees’ forecasts.	Number of trees in the forest is 10. Minimum subset size is 5.
SVMs	Support Vector Regression (SVR) endeavors to ascertain the most favorable hyperplane that maximizes the disparity between anticipated and actual values. To achieve this objective, the input features are projected into an elevated-dimensional space wherein the hyperplane can be delineated with greater precision.	SVM Cost is 1. Regression loss epsilon: 0.1. Kernel type is radial basis function. Iteration limit is 100.
DT	This methodology constitutes a coherent and comprehensible machine learning paradigm that generates a hierarchical tree structure to illustrate the relationships between the input data and the target variable. The primary nodes within the tree signify decisions derived from various features, whereas the terminal nodes denote expected outcomes.	Minimum instances in leaves are 19. Minimum subset size: 9. Maximal tree depth is 100. The stopping point is at 95% of the majority.
KNN	This methodology represents a supervised learning paradigm that is characterized by its non-parametric nature. The input consists of the k nearest training instances derived from a designated dataset. The resultant output of the K-Nearest Neighbors regression pertains to the attribute value of the object under consideration. This specific value is determined by calculating the mean of the values associated with the K-Nearest Neighbors.	Number of nearest neighbors is 5. Metric is Euclidean. Weights are uniform.
LR	A straightforward and extensively employed machine learning algorithm is utilized for the purpose of forecasting continuous numerical outcomes. It operates under the presumption of a linear correlation between the input variables and the targeted result. The primary aim of Linear Regression is to ascertain the optimal fitting line that minimizes the discrepancy between the predicted and observed values.	None.
NN—(L-BFGS)	This method emulates the functionality of biological human neural networks. It is commonly employed in nonlinear systems to replicate complex interactions between inputs and outputs. Neurons, which serve as the fundamental components of the network, are organized in layers and interconnected through weights. The network undergoes a learning or adaptation process when adjustments are made to the weights, enabling the network to yield accurate outputs. The linear combination of the inputs corresponds to the product of the weights and the inputs. L-BFGS represents an optimization technique within the quasi-Newton methods category, which approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) while utilizing a limited computational memory footprint.	Neurons per hidden layer are 1500. Activation is ReLu. Solver is L-BFGS. Maximum iterations are 500.
NN—(Adam)	This approach replicates the operations of biological human neural networks. It is frequently applied within nonlinear frameworks to model sophisticated interactions among inputs and outputs. Neurons, which constitute the core elements of the network, are assembled in layers and interlinked via weights. The network learns or adjusts as the weights are modified to ensure the generation of correct outputs. The linear combination of the inputs is equivalent to the product of the weights and the inputs. Adam is an iterative optimization algorithm employed for the purpose of minimizing the loss function during the training phase of neural networks.	Neurons per hidden layer are 1500. Activation is ReLu. The solver is Adam. Maximum iterations are 500.
SGD	An iterative procedure for optimizing an objective function characterized by adequate smoothness properties. It can be conceptualized as a stochastic approximation of gradient descent optimization, as it substitutes an estimated gradient for the actual gradient, which is derived from the complete dataset.	Regression loss function is hinge. Regularization is none. Regularization strength is 0. Learning rate is constant. Initial learning rate is 0.01. Number of iterations is 1000.

Table 3. Results of the developed ML models.

Model	MSE	RMSE	MAE	R²
AdaBoost	0.002	0.046	0.037	0.938
Random Forest	0.002	0.047	0.037	0.933
Gradient Boosting	0.002	0.048	0.038	0.93
Neural network (L-BFGS)	0.003	0.051	0.037	0.922
Neural network (Adam)	0.006	0.077	0.056	0.822
SVM	0.01	0.1	0.068	0.696
kNN	0.022	0.149	0.093	0.335
Tree	0.026	0.16	0.085	0.226
Linear Regression	0.028	0.168	0.112	0.146
SGD	0.031	0.176	0.112	0.065

Table 4. Result of the K-fold cross-validation procedure.

Model	MSE	RMSE	MAE	R²
Gradient Boosting	0.003	0.057	0.046	0.914
Random Forest	0.003	0.057	0.044	0.913
AdaBoost	0.003	0.058	0.047	0.911
Neural network (L-BFGS)	0.004	0.065	0.049	0.887
Neural network (Adam)	0.008	0.089	0.063	0.789
SVM	0.016	0.127	0.085	0.568
kNN	0.037	0.193	0.129	0.002
Tree	0.038	0.194	0.127	0.002
SGD	0.038	0.195	0.136	0.001
Linear Regression	0.038	0.195	0.138	0.001

Table 5. Result of random sampling procedure.

Model	MSE	RMSE	MAE	R²
Random Forest	0.003	0.058	0.044	0.907
AdaBoost	0.004	0.06	0.048	0.901
Gradient Boosting	0.004	0.06	0.048	0.9
Neural network (L-BFGS)	0.01	0.099	0.062	0.728
Neural network (Adam)	0.011	0.105	0.071	0.696
SVM	0.018	0.136	0.09	0.489
Tree	0.032	0.18	0.128	0.103
Linear Regression	0.036	0.19	0.136	0.001
SGD	0.038	0.194	0.137	0.001
kNN	0.038	0.196	0.133	0.001

Table 6. Field case Well-X input parameters.

	Unit	Value
Actual entry hole diameter	inches	0.36
Perforation depth	ft	7979
Rock density	g/cc	1.6
Shot phasing	degrees	60
Shot density	shot/ft	4
Fracture pressure	psi	6542
Reservoir UCS	psi	3166
Casing elastic limit	kpsi	55
Casing nominal weight	lb/ft	47
Casing OD	inches	9.625
Gun diameter	inches	4.5
Pumping rate	bbl/min	80
Frac fluid density	ppg	8.55
Discharge coefficient	-	0.7
Number of perforations	-	60
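The pumping rate, fluid density, discharge coefficient, and perforation count in Table 6 are the inputs of a perforation friction calculation. The sketch below assumes the widely used oilfield-unit relation ΔPp = 0.2369·Q²·ρ / (N²·D⁴·C_d²), with the symbols defined in the Abbreviations. Whether the article applies exactly this form is an assumption, and the function name is hypothetical. The 0.34 in. and 0.38 in. cases only bracket the 0.02 in. deviation around the measured 0.36 in. and are not reported predictions.

```python
def perforation_friction_psi(q_bpm, rho_ppg, n_perfs, d_in, cd):
    """Perforation friction pressure in psi (assumed standard oilfield-unit form).

    q_bpm   : total injection rate, bbl/min
    rho_ppg : fluid/slurry density, lb/gal
    n_perfs : number of open perforations
    d_in    : perforation entry hole diameter, in.
    cd      : discharge coefficient, dimensionless
    """
    return 0.2369 * q_bpm ** 2 * rho_ppg / (n_perfs ** 2 * d_in ** 4 * cd ** 2)


# Well-X inputs from Table 6.
well_x = {"q_bpm": 80.0, "rho_ppg": 8.55, "n_perfs": 60, "cd": 0.7}

# Measured EHD (0.36 in.) bracketed by a 0.02 in. deviation on either side.
for diameter in (0.34, 0.36, 0.38):
    dp = perforation_friction_psi(d_in=diameter, **well_x)
    print(f"EHD = {diameter:.2f} in. -> perforation friction = {dp:.0f} psi")

dp_actual = perforation_friction_psi(d_in=0.36, **well_x)
dp_small = perforation_friction_psi(d_in=0.34, **well_x)
dp_large = perforation_friction_psi(d_in=0.38, **well_x)
```

Because the diameter enters to the fourth power, a small error in EHD produces a disproportionately large error in the calculated friction pressure, which is why an accurate EHD estimate matters for limited entry design.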

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

