Article

Construction of Analogy Indicator System and Machine-Learning-Based Optimization of Analogy Methods for Oilfield Development Projects

1
Research Institute of Petroleum Exploration & Development, PetroChina, Beijing 100083, China
2
The First Natural Gas Plant of PetroChina Qinghai Oilfield Company, Golmud 816000, China
*
Authors to whom correspondence should be addressed.
Energies 2025, 18(15), 4076; https://doi.org/10.3390/en18154076 (registering DOI)
Submission received: 27 June 2025 / Revised: 23 July 2025 / Accepted: 25 July 2025 / Published: 1 August 2025
(This article belongs to the Section H1: Petroleum Engineering)

Abstract

Oil and gas development is characterized by high technical complexity, strong interdisciplinarity, long investment cycles, and significant uncertainty. To meet the need for quick evaluation of overseas oilfield projects with limited data and experience, this study develops an analogy indicator system and tests multiple machine-learning algorithms on two analogy tasks to identify the optimal method. Starting from an initial set of basic indicators and a database of 1436 oilfield samples, a combined subjective–objective weighting strategy that integrates statistical methods with expert judgment is used to select, classify, and assign weights to the indicators. This process results in 26 key indicators for practical analogy analysis. Single-indicator and whole-asset analogy experiments are then performed with five standard machine-learning algorithms: support vector machine (SVM), random forest (RF), backpropagation neural network (BP), k-nearest neighbor (KNN), and decision tree (DT). Results show that, for medium-to-high permeability sandstone oilfields, SVM achieves classification accuracies of 86% and 95% on the single-indicator and whole-asset tasks, respectively, greatly surpassing the other methods. These results demonstrate the effectiveness of the proposed indicator system and methodology, providing efficient and objective technical support for evaluating and making decisions on overseas oilfield development projects.

1. Introduction

The complex process of oil and gas development involves high technical demands, strong interdisciplinarity, lengthy investment cycles, significant uncertainty, and substantial capital commitments [1]. Assessing potential development projects is further complicated by various project types, limited data availability, and tight decision-making timelines [2,3,4,5,6].
The reservoir analogy, which derives the potential of a new project from the experience of developed fields, plays a crucial role in reducing geological risk, optimizing development plans, and speeding up decision-making. It is particularly helpful during pre-development and early production stages and is most effective when used with newly discovered reservoirs within or near mature oil and gas fields. In the evaluation of new oilfield development projects, especially those overseas lacking operational experience or enough data, reservoir analogy offers guidance for reserves estimation, production forecasting, and strategy development. Both the U.S. Securities and Exchange Commission (SEC) regulations and the Society of Petroleum Engineers’ (SPE) Petroleum Resources Management System (PRMS) standard explicitly support using reliable analogies for reserves assessment when direct data are limited [7,8].
Analogy methods have been proposed or applied in reservoir characterization [9], seismic attribute analysis [10], and resource assessment [11]. With the widespread use of machine learning in petroleum engineering and its strong modeling ability and potential for generalization in reservoir identification, parameter prediction, and development scheme optimization, an increasing number of studies have used machine-learning techniques for analogy and prediction of various oilfield indicators [12,13,14,15,16,17,18]. Bai et al. [19] developed productivity prediction models using linear regression (LR), random forest regression (RF), support vector regression (SVR), backpropagation neural networks (BP), extreme gradient boosting (XGBoost), and LightGBM, demonstrating the role of machine learning in oilfield analogy and production forecasting. Guo et al. [20] introduced a new analogy and machine-learning approach for predicting reservoir permeability. Mahdaviara et al. [21] created a tool using statistical and machine-learning methods to evaluate and screen enhanced oil recovery (EOR) scenarios for low-permeability reservoirs. Rahimi and Riahi [22] classified offshore reservoir facies based on logging data with the RF method.
However, limitations remain in the current analogy practices used in oilfield development. The choice of analogy processes and parameters still heavily depends on expert knowledge and subjective judgment, lacking a systematic and objective evaluation framework. Oilfield projects involve numerous parameters, including geological, reservoir, fluid, reserves, and development factors. Because of the complexity in evaluating multiple parameters, many analogy studies focus on only a limited subset or a single category of parameters for similarity assessment, leading to the absence of a comprehensive and systematic indicator framework [23,24,25,26,27,28]. The weighting of analogy indicators is traditionally based on expert scoring or the Analytic Hierarchy Process (AHP). These methods, being highly subjective and poorly standardized, often produce significant discrepancies among different technical personnel, thus reducing comparability and reproducibility. Additionally, most existing studies on analogy methods concentrate on individual oilfield indicators, while research on analogy at the level of whole asset evaluation remains limited.
To address these challenges, this study aims to develop a comprehensive analogy indicator system specifically designed for overseas oilfield development projects and to optimize analogy methods using multiple machine-learning algorithms. First, raw indicators are extracted and categorized from commercial databases and representative development projects. Next, a set of key analogy indicators is generated through screening, classification, and weighting processes. Five commonly used machine-learning algorithms are then employed to model these key indicators and conduct both single-indicator and whole-asset analogy experiments, with performance evaluated based on adaptability and predictive accuracy. The experimental results validate the effectiveness of the proposed indicator system and identify support vector machine (SVM) as the most appropriate algorithm for this application.
The main contributions of this study are as follows:
  • A range of statistical techniques, including the correlation coefficient, systematic clustering, and principal component analysis, are used to screen the original set of indicators and identify key analogy indicators. A classification scheme is then created for the selected key indicators based on probability statistics analysis and expert judgment, ensuring both representativeness and engineering relevance.
  • A combined subjective–objective weighting method is proposed for key indicators. Subjective weights are assigned using direct expert scoring, while objective weights are derived from the averaged results of the entropy method and the coefficient of variation method. This approach ensures that the weighting reflects both expert experience and data characteristics.
  • Through screening, classification, and weighting, a comprehensive analogy indicator system is developed. It combines static and dynamic parameters from geological, petrophysical, and development aspects, integrating both subjective and objective views to support similarity evaluation between target and candidate oilfields.
  • Five machine-learning methods, namely support vector machine (SVM), random forest (RF), backpropagation neural network (BP), k-nearest neighbor (KNN), and decision tree (DT), are used to perform both single-indicator and whole-asset analogy experiments. The adaptability and prediction accuracy of each method are evaluated under different reservoir conditions, such as medium-to-high permeability sandstone and low-permeability sandstone, leading to the identification of the optimal algorithm.
The paper is organized as follows. Section 2 provides a detailed discussion of the methodology and procedures for constructing the analogy indicator system and selecting analogy methods for oilfield development projects. Section 3 presents and analyzes the application of the proposed analogy indicator system and the selection of analogy methods, using real data from oilfield projects. Finally, Section 4 summarizes the main findings of the study.

2. Materials and Methods

Focusing on analogy analysis for oilfield development projects, the study includes two main components: the construction of analogy indicators and the optimization of analogy methods, as shown in Figure 1.
In the indicator construction component, technical indicators and representative oilfield data were first categorized and organized to create a base indicator system and database tailored to the characteristics of overseas oilfield development projects. Based on the requirements of the analogy task, key analogy indicators were then identified through a series of steps, including indicator screening, classification, and weighting, to prepare for subsequent analogy procedures. In the analogy method component, machine-learning algorithms were applied to perform both single-indicator and whole-asset analogy experiments for new oilfield development projects, to identify the most suitable method to meet the needs of such projects.

2.1. Analogy Indicator System for Oilfield Development Projects

The initial process for selecting basic analogy indicators involved several steps. First, relevant data were collected from multiple sources, including commercial databases like the C&C Reservoir database (http://www.ccreservoirs.com, accessed on 27 June 2025), the IHS Markit Energy Portal (https://energyportal.ci.spglobal.com, accessed on 27 June 2025), and the Wood Mackenzie database (https://www.woodmac.com, accessed on 27 June 2025), as well as historical project records and reviewed literature. Second, the objective analysis method was used to identify key factors that influence project evaluation, ensuring the indicators were comprehensive, complete, and easily interpretable. Third, the key factors were grouped into different indicator sets based on their technical features. Finally, expert consultation was conducted to validate the results, resulting in the basic analogy indicators.
A total of 36 basic analogy indicators were initially chosen, covering both static and development parameters. The static parameters include reservoir properties, trap and structural characteristics, fluid properties, and reserve parameters. Among these, eight are qualitative indicators (highlighted in blue), and twenty-eight are quantitative indicators (shown in black), as shown in Figure 2.
Among the 36 selected indicators, static parameters directly represent key geophysical and petrophysical attributes that govern reservoir performance. Reservoir properties such as lithology, average porosity, and permeability quantify storage capacity, while trap and structural characteristics define the reservoir framework. Fluid properties control fluid mobility and drive mechanisms, and reserve parameters measure volumetric resource potential [29]. Development parameters directly reflect production-process performance and reservoir drive characteristics. By integrating static and development parameters, the basic analogy indicators are firmly grounded in reservoir characteristics and field production performance, thus providing a comprehensive and reliable basis for analogy in oilfield development projects.
Using these 36 indicators, data from various oilfields were compiled and screened to build a basic analogy indicator database for different oilfield types. This database includes data from 1436 oilfields, categorized by region (onshore and offshore) and lithology (sandstone and carbonate).
Among the thirty-six basic analogy indicators, eight qualitative indicators—derived from oilfield classification standards or expert judgment—have clear categorical distinctions and classification functions. As categorical variables, they are not suitable for quantitative analysis and were therefore directly included in the construction of the analogy indicator system. For the remaining 28 quantitative indicators, statistical methods were used for screening, classification, and weight calculation to establish a more scientifically based key analogy indicator system. This system provides parameters and a foundation for future research on analogy methods for oilfield development projects. The flowchart for constructing the key analogy indicator system is shown in Figure 3.

2.1.1. Key Indicator Screening

The initially selected basic analogy indicators aimed to comprehensively capture all key factors characterizing oilfield development projects and influencing project evaluation. However, practical issues may arise with these indicators, such as overlapping or correlated parameters, parameters that do not clearly reflect evaluation-relevant features, and the need to clarify each parameter’s relative importance. Therefore, a secondary screening of the basic indicators is necessary to reduce redundancy and correlation among them and to identify the key indicators for analogy analysis.
In this study, three statistical methods were used for key indicator screening: the correlation coefficient method, the systematic clustering method, and the principal component analysis method.
Correlation Coefficient Method
The correlation coefficient is a statistical metric used to measure the strength of the relationship between two variables. It is calculated using the product-moment method, which is based on the deviations of each variable from its respective mean. The degree of correlation between the two variables is reflected by the product of these deviations.
Given two variables, $x = (x_1, x_2, \ldots, x_k, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_k, \ldots, y_n)$, the correlation coefficient $r_{xy}$ is calculated using the following formula:
$$r_{xy} = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2 \, \sum_{k=1}^{n} (y_k - \bar{y})^2}},$$
where $\bar{x}$ and $\bar{y}$ represent the mean values of variables $x$ and $y$, respectively. The absolute value of the correlation coefficient, $|r_{xy}|$, reflects the degree of similarity between the two variables.
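As an illustration, the product-moment formula can be written directly in NumPy. This is a minimal sketch; the function name `pearson_r` and the sample values are hypothetical and are not taken from the study's database.

```python
import numpy as np

def pearson_r(x, y):
    """Product-moment correlation coefficient r_xy of two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical values, for illustration only
porosity = [12.0, 18.0, 22.0, 25.0, 30.0]
permeability = [5.0, 40.0, 120.0, 300.0, 900.0]
r = pearson_r(porosity, permeability)
```

The result agrees with the off-diagonal entry of `np.corrcoef(porosity, permeability)`, which is the more convenient call when a full correlation matrix is needed.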
Systematic Clustering Method
Cluster analysis groups sample data based on individual characteristics of the research objects according to predefined classification criteria. Systematic clustering is a hierarchical classification method based on distance metrics, with the core idea of iteratively merging similar objects to form hierarchical groupings from fine to coarse levels. In this method, each sample or variable is initially treated as an individual class. The algorithm then repeatedly calculates inter-class distances and merges the two closest classes until all objects are grouped into a single cluster. This process can be visually represented by a dendrogram, allowing researchers to select an appropriate distance threshold based on practical needs to determine the final clustering scheme. Systematic clustering does not require a predefined number of clusters and can naturally reveal the data’s hierarchical structure. However, its computational complexity increases significantly with the number of samples, making it more suitable for exploratory analysis of small to medium-sized datasets.
In variable clustering studies, the systematic clustering method is often combined with the correlation coefficient to identify groups of key indicators with similar variation patterns. For clustering analysis, the correlation coefficient was converted into a distance metric as follows:
$$d_{xy} = 1 - |r_{xy}|,$$
where $d_{xy}$ denotes the distance between variables $x$ and $y$, calculated as one minus the absolute value of their correlation coefficient $|r_{xy}|$. A smaller value of $d_{xy}$ indicates a higher similarity between variables $x$ and $y$, and such variables should be clustered together with higher priority.
Systematic clustering groups data by iteratively merging similar classes, with its core process being the definition of inter-class distances. This study adopts the Single Linkage method. The specific steps are as follows:
  • Each sample or variable is initially treated as an independent class, denoted as $G_1, G_2, \ldots, G_n$. All pairwise distances are calculated to form the initial distance matrix $D(0)$, where the element $D_{ij} = d_{xy}$ represents the distance between variables.
  • The minimum distance element in the current matrix is selected as $D_{pq} = \min\{D_{ij}\}$, and the corresponding classes $G_p$ and $G_q$ are merged to form a new class $G_s$.
  • The distance matrix is updated by calculating the distance between the new class $G_s$ and any other class $G_k$ as $D_{sk} = \min\{D_{pk}, D_{qk}\}$, where $D_{pk}$ and $D_{qk}$ are the distances between the original classes $G_p$, $G_q$, and class $G_k$.
  • The above steps are repeated until all classes are merged into a single class. The entire clustering process can be visualized using a dendrogram, and researchers can determine the final classification scheme by selecting an appropriate distance threshold based on practical needs.
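The merging steps above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the code used in the study: the function names `single_linkage` and `cut`, the 4 × 4 correlation matrix, and the cut-off distance of 0.4 are all hypothetical.

```python
import numpy as np

def single_linkage(D):
    """Return the merge history [(members_p, members_q, distance), ...]."""
    D = np.array(D, dtype=float)           # work on a copy
    np.fill_diagonal(D, np.inf)            # ignore self-distances
    clusters = {i: (i,) for i in range(len(D))}
    merges = []
    while len(clusters) > 1:
        active = sorted(clusters)
        sub = D[np.ix_(active, active)]
        a, b = np.unravel_index(np.argmin(sub), sub.shape)
        p, q = active[a], active[b]        # closest pair of classes G_p, G_q
        merges.append((clusters[p], clusters[q], float(D[p, q])))
        # D_sk = min(D_pk, D_qk); the merged class G_s reuses index p
        D[p, :] = np.minimum(D[p, :], D[q, :])
        D[:, p] = D[p, :]
        D[p, p] = np.inf
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return merges

def cut(merges, n, threshold):
    """Group labels obtained by applying only merges with distance <= threshold."""
    label = list(range(n))
    for left, right, dist in merges:
        if dist <= threshold:
            for i in right:
                label[i] = label[left[0]]
    return label

# Hypothetical correlation matrix of four indicators
R = np.array([[1.0, 0.9, 0.2, 0.1],
              [0.9, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, -0.7],
              [0.1, 0.2, -0.7, 1.0]])
merges = single_linkage(1.0 - np.abs(R))   # d_xy = 1 - |r_xy|
groups = cut(merges, n=4, threshold=0.4)
```

Here indicators 0 and 1 merge first, then 2 and 3, and the two pairs join only at the final step. For larger problems, `scipy.cluster.hierarchy.linkage` with `method="single"` is a likely more efficient alternative that also supports dendrogram plotting.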
Principal Component Analysis Method
Factor analysis is a multivariate dimensionality reduction method aimed at extracting a small number of latent, unobservable common factors from a set of correlated observed variables. The goal is to simplify the data structure and reveal the intrinsic relationships among variables while retaining as much original information as possible. Factor analysis explains most of the variation in the original observed variables through common factors. It assumes that the original variables $X_1, X_2, \ldots, X_p$ can be expressed as a linear combination of $k$ latent factors $F_1, F_2, \ldots, F_k$ (where $k \le p$), along with a unique error term $\varepsilon_i$ associated with each variable. The mathematical model of factor analysis can thus be expressed as:
$$X_i = a_{i1} F_1 + a_{i2} F_2 + \cdots + a_{ik} F_k + \varepsilon_i, \quad i = 1, 2, \ldots, p.$$
It can also be expressed in matrix form as:
$$X = A F + \varepsilon,$$
where $A = (a_{ij})$ is the factor loading matrix, representing the degree of correlation between the original variables and the extracted factors.
We extract principal components using principal component analysis and employ them as factor inputs in the subsequent modeling. The specific steps for determining factor variables using principal component analysis in this study are as follows:
1. Standardize the original variables to zero mean and unit variance in order to eliminate the influence of differing units and scales.
2. Construct the correlation matrix $R$ for the original variables.
3. Perform eigenvalue decomposition on the correlation matrix $R$. Let $U$ be the matrix composed of the top $k$ principal eigenvectors, and $\Lambda$ be the diagonal matrix of the corresponding eigenvalues. The factor loading matrix $A$ can then be computed as:
$$A = U \Lambda^{1/2},$$
where $a_{ij} = u_{ij} \sqrt{\lambda_j}$ represents the loading of variable $X_i$ on factor $F_j$, i.e., the degree to which factor $F_j$ explains the variance of variable $X_i$.
4. Determine the number of factors $k$ based on the cumulative variance contribution rate, choosing the smallest $k$ such that the cumulative explained variance is at least 90%.
5. Linearly transform the original variables into factor scores to be used as inputs for subsequent modeling. Let $w_{ji}$ denote the weight of variable $X_i$ on factor $F_j$. The factor score, which is a weighted sum of all variables on a given factor, can be calculated as follows:
$$F_j = w_{j1} X_1 + w_{j2} X_2 + \cdots + w_{jp} X_p.$$
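The five steps can be sketched with NumPy as follows. The function name `pca_factors` and the synthetic data are hypothetical. The text does not say how the score weights are obtained, so the sketch assumes the eigenvectors themselves serve as the weights (ordinary principal component scores); other choices are possible.

```python
import numpy as np

def pca_factors(X, min_cum_var=0.90):
    """Return loadings A, component scores, and the number of factors k."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize to zero mean and unit variance
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation matrix of the variables
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigen-decomposition (eigh returns ascending order, so reorder)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: smallest k whose cumulative explained variance reaches the target
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.argmax(cum >= min_cum_var)) + 1
    U, lam = eigvecs[:, :k], eigvals[:k]
    A = U * np.sqrt(lam)                   # a_ij = u_ij * sqrt(lambda_j)
    # Step 5 (assumption): eigenvectors are used as the score weights
    scores = Z @ U
    return A, scores, k

# Synthetic data: two nearly duplicated pairs of variables
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], base[:, 0] + 0.05 * rng.normal(size=200),
                     base[:, 1], base[:, 1] + 0.05 * rng.normal(size=200)])
A, scores, k = pca_factors(X)
```

Because each pair of columns is almost perfectly correlated, two components are enough to pass the 90% threshold in this example.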

2.1.2. Key Indicators Classification

After identifying the key analogy indicators, it is necessary to further classify and assign weights to them to support both single indicator and whole asset analogy tasks. In this study, quantitative indicators were classified using both probability distribution curves and expert knowledge, ensuring a balance between objective data features and subjective domain experience. This method improves the credibility of the classification results by considering both expert insights and variability from random sampling.
Based on the probabilistic analysis of key quantitative indicators, the distribution patterns were categorized into the following types:
1. Normal distribution type: This includes indicators such as Average Porosity, Reservoir Burial Depth, Initial Reservoir Pressure, Oil API Gravity, Original Oil in Place, and Recovery Factor.
For normally distributed indicators, the classification thresholds were calculated based on the mean ($\mu$) and standard deviation ($\sigma$), using the values $\mu - \sigma$, $\mu - \sigma/2$, $\mu + \sigma/2$, and $\mu + \sigma$.
2. Exponential distribution type: This category includes Net Pay Thickness, Average Permeability, Oil Volume Factor, Bubble Point Pressure, Reserves Abundance, Well Pattern Density, Initial Production per Well, Peak Production Rate, and Oil Production Rate.
For exponentially distributed indicators, the classification thresholds were determined based on the characteristic quantiles extracted from the cumulative distribution function (CDF) at probability levels of 15%, 30%, 70%, and 85%.
3. Uniform distribution type: This category includes Average Net-to-Gross Ratio and Recovery Efficiency of Reserves.
For uniformly distributed indicators, the classification thresholds were defined by the characteristic quantiles extracted from the CDF at probability levels of 20%, 40%, 60%, and 80%.
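A short sketch of how the four thresholds per indicator might be computed and applied is given below. It assumes empirical quantiles stand in for the CDF (the text does not say whether a fitted or an empirical CDF was used), uses the sample standard deviation, and numbers the resulting classes 1 to 5 from low to high; all names are hypothetical.

```python
import numpy as np

EXPONENTIAL_PROBS = (0.15, 0.30, 0.70, 0.85)
UNIFORM_PROBS = (0.20, 0.40, 0.60, 0.80)

def normal_thresholds(values):
    """mu - sigma, mu - sigma/2, mu + sigma/2, mu + sigma."""
    v = np.asarray(values, dtype=float)
    mu, sigma = v.mean(), v.std(ddof=1)
    return np.array([mu - sigma, mu - sigma / 2, mu + sigma / 2, mu + sigma])

def quantile_thresholds(values, probs):
    """Empirical quantiles at the given cumulative probability levels."""
    return np.quantile(np.asarray(values, dtype=float), probs)

def classify(values, thresholds):
    """Map each value to a class 1..5 defined by four increasing thresholds."""
    return np.digitize(values, thresholds) + 1

# Hypothetical, uniformly spread indicator values 1, 2, ..., 100
values = np.arange(1, 101)
t_uniform = quantile_thresholds(values, UNIFORM_PROBS)
classes = classify([1, 50, 100], t_uniform)
t_normal = normal_thresholds([2, 4, 4, 4, 5, 5, 7, 9])
```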

2.1.3. Key Indicator Weighting

In analogy-based evaluation of new oilfield development projects, different analogy objectives relate to various key influencing factors and their associated weights. The weight assigned to each key indicator indicates its relative contribution to the final evaluation result. Common weighting methods are divided into three categories: subjective methods (such as analytic hierarchy process, expert scoring), objective methods (such as entropy method, coefficient of variation method), and combined subjective–objective methods.
Subjective methods reflect the preferences and judgments of decision-makers, but they often lack objectivity. Objective methods depend on the inherent structure of the data but may miss the practical significance of indicators. To balance expert opinions with data-driven insights, this study uses a combined subjective–objective weighting approach. For the subjective part, expert scoring was used, where domain specialists assigned scores to each indicator based on engineering experience and the analogy objective, producing the subjective weights. For the objective part, both the entropy method and the coefficient of variation method were applied to calculate the objective weights, and their average was used as the final objective value.
Entropy Method
The entropy method is an objective weighting technique based on information entropy theory. Entropy measures uncertainty, with an indicator’s entropy value indicating its degree of dispersion. The lower the entropy value, the higher the dispersion and the greater the impact of that indicator on the overall evaluation. The steps for calculating the entropy method are as follows:
1. Normalize the data. Let $i$ denote the sample index and $j$ denote the indicator index. Let $X_{ij}$ represent the original value of the $j$-th indicator for the $i$-th sample. Denote the minimum and maximum values of all samples for the $j$-th indicator as $\min X_j$ and $\max X_j$, respectively. Each $X_{ij}$ is linearly mapped to the interval [0, 1], resulting in the normalized value $X'_{ij}$.
For a positive indicator, the normalization is computed as follows:
$$X'_{ij} = \frac{X_{ij} - \min X_j}{\max X_j - \min X_j}.$$
For a negative indicator, the normalization is computed as follows:
$$X'_{ij} = \frac{\max X_j - X_{ij}}{\max X_j - \min X_j}.$$
2. Calculate the proportion of the $j$-th indicator in the $i$-th sample, denoted as $Y_{ij}$, using the following formula:
$$Y_{ij} = \frac{X'_{ij}}{\sum_{i=1}^{m} X'_{ij}}.$$
3. Compute the entropy value of the $j$-th indicator. Let $m$ be the total number of samples and $k = 1/\ln m$ be the normalization constant. The entropy value $e_j$ is calculated as follows:
$$e_j = -k \sum_{i=1}^{m} Y_{ij} \ln Y_{ij}.$$
4. Compute the redundancy $d_j$ and derive the weight of each indicator. The redundancy is given by $d_j = 1 - e_j$, and the final weight $w_j^{ent}$ is calculated as follows:
$$w_j^{ent} = \frac{d_j}{\sum_{j=1}^{n} d_j}.$$
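The four steps translate into a compact NumPy routine. This is a sketch with a hypothetical function name and data; it assumes every indicator takes at least two distinct values (otherwise the min-max normalization divides by zero) and adopts the usual convention that a term with zero proportion contributes zero to the entropy.

```python
import numpy as np

def entropy_weights(X, positive=None):
    """Entropy-method weights for an (m samples x n indicators) matrix."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    positive = np.ones(n, dtype=bool) if positive is None else np.asarray(positive, dtype=bool)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Step 1: min-max normalization, reversed for negative indicators
    Xn = np.where(positive, (X - lo) / (hi - lo), (hi - X) / (hi - lo))
    # Step 2: proportion of each sample within its indicator column
    Y = Xn / Xn.sum(axis=0)
    # Step 3: entropy, with 0 * ln(0) taken as 0 (assumed convention)
    terms = np.zeros_like(Y)
    mask = Y > 0
    terms[mask] = Y[mask] * np.log(Y[mask])
    e = -terms.sum(axis=0) / np.log(m)
    # Step 4: redundancy and normalized weights
    d = 1.0 - e
    return d / d.sum()

# Hypothetical data: column 0 is evenly spread, column 1 is concentrated in one sample
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 1.0]])
w_ent = entropy_weights(X)
```

The concentrated column has the lower entropy and therefore receives the larger weight, in line with the statement that lower entropy implies greater impact.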
Coefficient of Variation Method
The coefficient of variation method is an objective weighting approach based on the degree of variability of indicator data. The coefficient of variation is a relative measure of data dispersion, calculated as the standard deviation divided by the mean. This method removes the effects of units and magnitude. Because of its properties, the coefficient of variation measures how much an indicator varies relative to others. A higher coefficient indicates more dispersion, meaning the indicator has a greater influence on the overall evaluation.
Let the mean and standard deviation of the $j$-th indicator across all samples be denoted as $\bar{x}_j$ and $s_j$, respectively. The coefficient of variation $\sigma_j$ is calculated using the following formula:
$$\sigma_j = \frac{s_j}{\bar{x}_j}.$$
The weight of indicator $j$ based on the coefficient of variation method, denoted as $w_j^{cv}$, is calculated using the following formula:
$$w_j^{cv} = \frac{\sigma_j}{\sum_{j=1}^{n} \sigma_j}.$$
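A minimal NumPy sketch of these two formulas follows. The function name and data are hypothetical, the sample standard deviation is an assumption (the text does not specify sample versus population), and indicator means are assumed to be positive.

```python
import numpy as np

def cv_weights(X):
    """Coefficient-of-variation weights for an (m samples x n indicators) matrix."""
    X = np.asarray(X, dtype=float)
    cv = X.std(axis=0, ddof=1) / X.mean(axis=0)  # sigma_j = s_j / mean_j
    return cv / cv.sum()

# Hypothetical data: both columns have s = 1, but means of 2 and 10
X = np.array([[1.0, 9.0], [2.0, 10.0], [3.0, 11.0]])
w_cv = cv_weights(X)  # coefficients of variation 0.5 and 0.1
```

The first indicator varies more relative to its mean and so takes five sixths of the total weight.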
Combined Subjective–Objective Weighting Method
This study adopts a linear weighting approach to combine subjective and objective methods. Let $b_j$ denote the subjective weight of indicator $j$, obtained using the expert direct rating method. Let $\alpha$, $\beta$, and $\gamma$ represent the weight coefficients for the subjective method, entropy method, and coefficient of variation method, respectively. The final weight $W_j$ for each indicator is calculated by linearly combining the subjective weight and the two objective weights, as shown below:
$$W_j = \alpha b_j + \beta w_j^{ent} + \gamma w_j^{cv}.$$
In this study, the coefficients are set as $\alpha = 0.5$ and $\beta = \gamma = 0.25$.
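As a small worked illustration, the combination can be computed as below. The three weight vectors are hypothetical values chosen for the example, not results from the study.

```python
import numpy as np

def combined_weights(b, w_ent, w_cv, alpha=0.5, beta=0.25, gamma=0.25):
    """W_j = alpha * b_j + beta * w_j^ent + gamma * w_j^cv."""
    b, w_ent, w_cv = (np.asarray(a, dtype=float) for a in (b, w_ent, w_cv))
    return alpha * b + beta * w_ent + gamma * w_cv

# Hypothetical weights for three indicators
b = [0.5, 0.3, 0.2]      # expert scoring
w_ent = [0.2, 0.3, 0.5]  # entropy method
w_cv = [0.4, 0.4, 0.2]   # coefficient of variation method
W = combined_weights(b, w_ent, w_cv)
```

Because the coefficients sum to one and each input vector sums to one, the combined weights also sum to one and need no further normalization.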

2.2. Analogy Methods for Oilfield Development Projects

Analogy, also referred to as analogical reasoning, is a method of inference that deduces the presence of certain properties in a target entity based on the known existence of similar properties in a comparable reference. The validity of such inference must be empirically verified, and the more attributes the two entities share, the more reliable the analogical conclusion becomes.
In oilfield development projects, analogy involves selecting similar reference projects to evaluate a new development project through comparison and inference, thereby enabling rapid screening and evaluation. In this study, two analogy approaches are employed: (1) single indicator analogy, which supplies reference values for missing indicators in the target asset, and (2) whole asset analogy, which qualitatively evaluates the overall potential and value of the target project. Both approaches are implemented using five machine-learning algorithms: support vector machine (SVM), random forest (RF), backpropagation neural network (BP), k-nearest neighbor (KNN), and decision tree (DT). The workflow of the analogy method is shown in Figure 4.

2.2.1. Machine-Learning Methods

Machine learning is an interdisciplinary field that combines statistics, artificial intelligence, and computer science. It is also called predictive analytics or statistical learning. Machine-learning algorithms find patterns and features directly from data using computational methods, without depending on predetermined equations. As the number of training samples grows, these algorithms can improve their performance adaptively. The main idea is to train models to discover the underlying patterns and key rules of a phenomenon, ultimately enabling prediction or decision-making.
This study builds on the previously established key analogy indicator system and uses five machine-learning methods to predict key indicators and assess the whole asset level of oilfield development projects. A comparative overview of the characteristics of these methods is shown in Table 1.

2.2.2. Single Indicator Analogy

According to oilfield classification, the key indicator system was used to comprehensively include both reservoir static parameters and development parameters. Machine-learning methods were then applied to predict the target indicator. In this study, experiments were carried out separately for two types of oilfields: onshore medium-to-high permeability sandstone reservoirs and onshore low-permeability sandstone reservoirs [30].
Before the analogy process, data preprocessing was performed on the key indicator set. Indicators with more than 70% sample coverage were chosen as feature variables. Using the recovery factor as the target variable for the analogy, seven key indicators were selected as features: well pattern density, oil API gravity, original oil in place, reserves abundance, average porosity, net pay thickness, and reservoir burial depth. After removing outliers, imputing missing values with the median, and normalizing the data, 663 samples from medium-to-high permeability sandstone oilfields and 157 samples from low-permeability oilfields were retained.
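Part of this preprocessing can be sketched as follows. The sketch covers the coverage filter, median imputation, and normalization only; outlier removal is omitted because the criterion is not described here. Min-max scaling is an assumption, since the normalization type is not stated, and the function name and data are hypothetical.

```python
import numpy as np

def preprocess(X, names, min_coverage=0.70):
    """Filter indicators by coverage, impute medians, and min-max normalize."""
    X = np.asarray(X, dtype=float)             # missing values encoded as NaN
    coverage = 1.0 - np.isnan(X).mean(axis=0)  # share of samples with a value
    keep = coverage > min_coverage             # "more than 70%" coverage
    X = X[:, keep]
    med = np.nanmedian(X, axis=0)              # per-indicator median
    X = np.where(np.isnan(X), med, X)          # impute missing values
    lo, hi = X.min(axis=0), X.max(axis=0)
    kept_names = [n for n, k in zip(names, keep) if k]
    return (X - lo) / (hi - lo), kept_names

# Hypothetical raw table: indicator "b" has only 50% coverage and is dropped
raw = np.array([[1.0, np.nan, 10.0],
                [2.0, np.nan, np.nan],
                [3.0, 5.0, 30.0],
                [4.0, 6.0, 40.0]])
features, kept = preprocess(raw, ["a", "b", "c"])
```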

2.2.3. Whole Asset Analogy

Using conventionally classified oilfield asset categories as the reference standard and incorporating the selected key indicators, machine-learning methods were applied to classify asset levels for onshore medium-to-high permeability sandstone oilfields, with a total of 663 samples. Based on oilfield characteristics, development potential, and economic benefits, the assets were divided into five levels, with Level 1 representing the highest priority. Compared to the limited sample size of low-permeability oilfields, the larger dataset of medium-to-high permeability oilfields is more suitable for training and optimizing machine-learning models, making it ideal for whole asset analogy and selection.
In the analogy indicator system for oilfield development projects, nine features were chosen: well pattern density, oil API gravity, original oil in place, reserves abundance, average porosity, average permeability, net pay thickness, reservoir burial depth, and recovery factor. These were used as inputs for the five machine-learning methods mentioned earlier to classify and prioritize target oilfield assets.
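A comparison of the five classifiers could be set up with scikit-learn along the following lines. This is only a sketch: the data are synthetic random numbers standing in for the nine features, the labels come from an equal-weight score split at its quintiles, and all estimators use library defaults because hyperparameters are not given in this section. It does not reproduce the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for nine key indicators (not real oilfield data)
rng = np.random.default_rng(0)
X = rng.random((300, 9))
score = X.mean(axis=1)                        # equal weights, for illustration
cuts = np.quantile(score, [0.2, 0.4, 0.6, 0.8])
y = 5 - np.digitize(score, cuts)              # levels 1 (highest) to 5

models = {
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "BP": MLPClassifier(max_iter=2000, random_state=0),
    "KNN": KNeighborsClassifier(),
    "DT": DecisionTreeClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each method
accuracy = {
    name: cross_val_score(make_pipeline(StandardScaler(), model), X, y, cv=5).mean()
    for name, model in models.items()
}
```

Wrapping each model in a pipeline keeps the scaling inside every cross-validation fold, which avoids leaking information from the held-out samples.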
In the conventional empirical classification, asset categories are determined via a linear weighted scoring procedure applied to the n selected key quantitative indicators, comprising the following steps:
1. Construct the evaluation matrix by arranging the $n$ indicators as columns and the 663 oilfields as rows.
2. Normalize each indicator to the [0, 1] range and multiply by its obtained key-indicator weight.
3. Compute each oilfield’s analogy score $S$ by summing the weighted, normalized scores.
4. Plot the score histogram and, based on empirical distribution characteristics, divide $S$ into five intervals to define the asset levels.
The histogram of analogy scores for the 663 medium-to-high permeability sandstone oilfields is shown in Figure 5. According to conventional empirical classification, the five asset levels are defined as Level 1 for S ≥ 0.58, Level 2 for 0.53 ≤ S < 0.58, Level 3 for 0.43 ≤ S < 0.53, Level 4 for 0.38 ≤ S < 0.43, and Level 5 for S < 0.38.
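The scoring procedure and the reported level boundaries can be sketched as follows. The function names and example matrix are hypothetical, and the sketch treats every indicator as a positive (larger-is-better) indicator, which is a simplifying assumption.

```python
import numpy as np

LEVEL_CUTS = (0.38, 0.43, 0.53, 0.58)  # boundaries on S reported in the text

def analogy_scores(X, weights):
    """Weighted sum of min-max normalized indicators (rows are oilfields)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return ((X - lo) / (hi - lo)) @ np.asarray(weights, dtype=float)

def asset_level(scores):
    """Level 1 for S >= 0.58 down to Level 5 for S < 0.38."""
    # np.digitize returns 0 below the first cut and 4 at or above the last
    return 5 - np.digitize(scores, LEVEL_CUTS)

# Hypothetical two-indicator example with equal weights
S = analogy_scores([[0.0, 0.0], [1.0, 1.0], [0.5, 1.0]], [0.5, 0.5])
levels = asset_level([0.60, 0.58, 0.55, 0.45, 0.40, 0.30])
```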
All analyses were conducted in Python 3.8, using NumPy, pandas, and SciPy for data handling and statistical computations, scikit-learn for machine-learning tasks, and Matplotlib 3.3.2 for visualization.

3. Results and Analysis

3.1. Construction of the Analogy Indicator System

3.1.1. Results of Key Indicator Screening

Correlation Coefficient Method
Among the twenty-eight basic quantitative indicators, three indicators—pressure coefficient, reservoir oil density, and surface oil viscosity—had a high proportion of missing data in the collected dataset and were therefore excluded from the correlation analysis. The remaining 25 indicators were divided into two categories: static parameters and development parameters. Correlation analysis was conducted separately for each category. The correlation coefficient matrix of static parameters is shown in Table 2, and that of development parameters is shown in Table 3.
The correlation coefficient is used to assess the degree of linear association between two variables. When $0 < |r_{xy}| < 1$, it indicates that a certain degree of linear correlation exists between the two variables. The closer $|r_{xy}|$ is to 1, the stronger the linear relationship between them. Conversely, the closer $|r_{xy}|$ is to 0, the weaker the linear correlation. In general, the correlation coefficient is interpreted in three levels: $|r_{xy}| < 0.4$ indicates low linear correlation, $0.4 \le |r_{xy}| < 0.7$ indicates significant correlation, and $0.7 \le |r_{xy}| < 1$ indicates high linear correlation.
However, due to differences in sample size, the number of variables, their characteristics, and the relationships among them, the threshold values used to determine the degree of correlation may vary in different situations. In some cases, a correlation coefficient greater than 0.5 is regarded as indicating good correlation, whereas in others, a threshold of 0.8 or even higher is required. We selected a threshold of 0.6. This choice balances the need for strong internal consistency within each indicator group against the requirement for clear separation between groups. Preliminary grouping experiments on our dataset showed that using a threshold of 0.6 produced stable clusters with high repeatability and minimal cross-group correlation.
In this study, based on the correlation coefficient ($0.6 \le |r_{xy}| < 1$), the static parameter indicators were classified into three groups. Within each group, indicators show strong internal correlation, while the correlation between groups is relatively weak. The first group includes Reservoir Burial Depth, Initial Reservoir Pressure, and Initial Reservoir Temperature. The second group consists of Initial Gas–Oil Ratio, Oil Volume Factor, and Bubble Point Pressure. The third group includes Reservoir Area, Original Oil in Place, and Recoverable Oil Reserves. The remaining eight static parameters were treated as independent indicators. For the eight development parameters, all pairwise correlation coefficients were below 0.6, so no pair reached the grouping threshold and each parameter was likewise treated as an independent indicator. The detailed classification of parameters is shown in Table 4.
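A threshold-based grouping of this kind can be sketched as follows. The text does not state the exact grouping rule, so this sketch makes an assumption: it uses complete-linkage clustering on the distance 1 − |r|, which places two indicators in the same group only if every pair in that group satisfies |r| ≥ 0.6. The function name `correlation_groups` and the toy columns `a`, `b`, `c` are hypothetical, and complete data (no missing values) are assumed.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def correlation_groups(df, threshold=0.6):
    """Group columns so that all pairs within a group have |Pearson r| >= threshold."""
    corr = df.corr().abs().to_numpy()
    dist = np.clip(1.0 - corr, 0.0, None)  # strong correlation -> small distance
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2.0           # enforce exact symmetry
    tree = linkage(squareform(dist, checks=False), method="complete")
    labels = fcluster(tree, t=1.0 - threshold, criterion="distance")
    groups = {}
    for name, label in zip(df.columns, labels):
        groups.setdefault(label, []).append(name)
    # Largest groups first; single-member groups are the independent indicators.
    return sorted(groups.values(), key=lambda g: (-len(g), g))


# Toy data: "a" and "b" are nearly collinear, "c" is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({"a": a, "b": a + 0.1 * rng.normal(size=200), "c": rng.normal(size=200)})
demo_groups = correlation_groups(demo)
```

A looser rule, such as merging any indicators connected by a chain of above-threshold pairs, can join groups through a single borderline coefficient, which is why the stricter all-pairs rule is used in this sketch.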
Systematic Clustering Method
Systematic clustering using the Single Linkage method was performed separately for static and development parameters, and the corresponding dendrograms are shown in Figure 6 and Figure 7. In the dendrogram, a smaller distance indicates a higher tendency for variables to cluster together, and indicators within the same cluster tend to have strong internal correlation and redundancy. The clustering results for both static and development parameters are summarized with cluster thresholds ranging from 2 to 5. Indicators grouped in the same cluster are enclosed in parentheses, while those without parentheses are considered independent indicators, as shown in Table 5 and Table 6.
According to the statistical results, a smaller clustering distance threshold leads to fewer merged clusters and more independent indicators, while a larger threshold results in more integrated clusters and fewer independent indicators. In this study, we consider the threshold value of 3 to be appropriate, as the resulting indicator sets are well suited for project evaluation requirements.
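The effect of the cut threshold can be illustrated with SciPy's hierarchical clustering routines. This is a sketch under stated assumptions: the distance between indicators is taken as the Euclidean distance between their standardized value vectors, which the text does not specify, so the threshold scale here need not match the 2 to 5 scale used in Table 5 and Table 6. The function name `cluster_indicators` and the indicator names `A` to `D` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_indicators(X, names, threshold):
    """Single-linkage clustering of indicators (columns of X), cut at a distance threshold."""
    X = np.asarray(X, dtype=float)
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each indicator
    # Transpose so that each indicator, not each oilfield, is one observation.
    tree = linkage(X_std.T, method="single", metric="euclidean")
    labels = fcluster(tree, t=threshold, criterion="distance")
    clusters = {}
    for name, label in zip(names, labels):
        clusters.setdefault(label, []).append(name)
    return list(clusters.values())


# Toy data: 50 samples, indicators A and B nearly identical, C and D unrelated.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
X_demo = np.column_stack([
    base,
    base + 0.1 * rng.normal(size=50),
    rng.normal(size=50),
    rng.normal(size=50),
])
names = ["A", "B", "C", "D"]
clusters_tight = cluster_indicators(X_demo, names, threshold=0.01)    # nothing merges
clusters_mid = cluster_indicators(X_demo, names, threshold=3.0)       # A and B merge
clusters_loose = cluster_indicators(X_demo, names, threshold=1000.0)  # everything merges
```

Raising the threshold can only merge clusters, never split them, which matches the trade-off between independent indicators and integrated clusters described above.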
These results align with the engineering significance of oilfield attributes. Static parameters were divided into 12 categories: Reservoir Area, Reservoir Oil Viscosity, and Net Pay Thickness form one group, reflecting the fundamental storage space and flow characteristics of the reservoir. Original Oil in Place and Recoverable Oil Reserves form another group, highlighting the dominant role of reserve volume in oilfield assessment. Initial Gas–Oil Ratio and Oil Volume Factor form a group, indicating how fluid properties affect drive efficiency and sustain production. Reservoir Burial Depth and Initial Reservoir Pressure form a further group, demonstrating the critical role of reservoir pressure in maintaining production energy. Development parameters were divided into five categories: Well Pattern Density, Initial Production per Well, and Peak Production Rate form one group, characterizing the decisive influence of well spacing and initial output on production capacity. Composite Decline Rate and Oil Production Rate form another group, illustrating production decline behavior and long-term performance stability. The remaining indicators each form individual groups, underscoring their independent value in evaluation.
Principal Component Analysis Method
To evaluate the explanatory power of each factor and the contribution of variables, a scree plot was used to visualize the importance of factors. The number of factors was determined by finding the inflection point of the eigenvalue curve. According to this method, calculations were performed separately for the static parameters and development parameters, and the results were displayed using scree plots, as shown in Figure 8 and Figure 9.
In the scree plot, lowering the eigenvalue threshold increases the number of extracted factors, raises the cumulative contribution rate, and retains more of the original information. The number of factors and the corresponding cumulative variance explained for various eigenvalue thresholds are shown in Table 7 and Table 8.
In this study, the number of factors was determined based on a cumulative variance explained threshold of 90%. For the static parameters, nine factors were extracted, meaning that the 17 indicators were grouped into nine factor categories. For the development parameters, six factors were extracted, corresponding to six categories for the eight development indicators.
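Selecting the number of factors from a cumulative-variance threshold can be sketched with scikit-learn as follows. This covers only the component-count step; the rotated factor loadings reported in the paper are not reproduced here. The function name `n_factors_for_variance` and the toy data are hypothetical, and standardizing the indicators before PCA is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def n_factors_for_variance(X, target=0.90):
    """Return the smallest number of components whose cumulative explained variance >= target."""
    X_std = StandardScaler().fit_transform(X)
    pca = PCA().fit(X_std)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # First index at which the cumulative ratio reaches the target, converted to a count.
    n_factors = int(np.searchsorted(cumulative, target) + 1)
    return min(n_factors, len(cumulative)), cumulative


# Toy data: two nearly redundant indicators plus one unrelated indicator.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X_demo = np.column_stack([a, a + 0.05 * rng.normal(size=200), rng.normal(size=200)])
n_factors, cumulative = n_factors_for_variance(X_demo, target=0.90)
```

In the toy data, the redundant pair collapses onto one component, so two components are enough to pass the 90% threshold.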
Based on the number of factors corresponding to the 90% cumulative contribution rate, the rotated factor loading matrices were calculated, as shown in Table 9 and Table 10.
Based on the results of the factor loading matrices, the classification of static and development parameters was summarized in Table 11.
These classification results align with the engineering significance of oilfield attributes. Static parameters were divided into nine categories: Reservoir Burial Depth, Initial Reservoir Pressure, and Initial Reservoir Temperature form one group, reflecting the subsurface pressure–temperature conditions that govern drive energy. Initial Gas–Oil Ratio, Oil Volume Factor, and Bubble Point Pressure form another group, indicating how fluid phase behavior and volumetric expansion potential control recovery efficiency. Reservoir Area, Original Oil in Place, and Recoverable Oil Reserves form a group, highlighting the volumetric capacity that underpins overall field value. Average Permeability, Reservoir Oil Viscosity, and Oil API Gravity form a further group, capturing the combined effects of reservoir permeability and fluid mobility on deliverability. Development parameters were divided into six categories: Well Pattern Density and Composite Water Cut form a group, characterizing development intensity and waterflooding performance. Recovery Factor and Oil Production Rate form another group, illustrating overall production capacity. The remaining indicators each form individual groups, underscoring their independent value in evaluation.
Correlation coefficient analysis, systematic clustering, and principal component analysis were used to classify both static and development parameters. The classification results from the three methods are summarized in Table 12 and Table 13. Indicators marked with the same color in the tables belong to the same category, while uncolored indicators are considered independent, meaning they are not grouped with any other indicators.
As shown in the tables above, the classification results of static parameters obtained by the three methods are generally consistent. For development parameters, the correlation coefficient method does not reduce the number of indicators, while the other two approaches yield alternative classification patterns. It is important to note that all three methods rely solely on statistical analysis and do not consider the geological or reservoir engineering significance of the indicators. Therefore, based on practical reservoir engineering needs and through expert discussion and validation, a comprehensive evaluation was performed to select the key indicators. In the end, twelve indicators were chosen from the seventeen static parameters and six from the eight development parameters. These eighteen quantitative indicators, combined with eight qualitative indicators, form a total of twenty-six key analogy indicators for oilfield development projects, as shown in Figure 10.

3.1.2. Results of Key Indicators Classification

Following the procedure outlined in Section 2.1.2, the probability distribution of each key quantitative indicator was calculated based on the grading criteria, and the results are summarized in Table 14. For the three parameters with existing industry-standard classification systems (porosity, permeability, and formation oil viscosity), we primarily used the established industry criteria. By systematically incorporating expert knowledge, the classification thresholds for all key quantitative indicators were optimized and calibrated, with the final classification standards shown in Table 15.

3.1.3. Results of Key Indicator Weighting

For the two categories of key quantitative indicators—static parameters and development parameters—we calculated objective weights using the entropy method and the coefficient of variation method. We assigned reasonable subjective weights based on expert judgment and then used the combined subjective–objective weighting formula to determine the comprehensive weight for each indicator. The results are shown in Table 16 and Table 17.
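The two objective weighting schemes can be sketched as follows, using their standard textbook definitions. The combination step is an assumption: the sketch uses a simple linear blend controlled by a hypothetical parameter `alpha`, and averages the two objective weight vectors before blending; the paper's own subjective–objective formula may differ. All function names and the toy matrix are illustrative, and indicator values are assumed to be non-negative with non-zero column sums and means.

```python
import numpy as np


def entropy_weights(X):
    """Entropy method: indicators whose values are more dispersed receive larger weights."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    P = X / X.sum(axis=0)  # share of each sample within each indicator
    with np.errstate(divide="ignore", invalid="ignore"):
        logs = np.where(P > 0, np.log(P), 0.0)  # treat 0 * log(0) as 0
    entropy = -(P * logs).sum(axis=0) / np.log(n)
    divergence = 1.0 - entropy
    return divergence / divergence.sum()


def cv_weights(X):
    """Coefficient of variation method: weight proportional to std / |mean|."""
    X = np.asarray(X, dtype=float)
    cv = X.std(axis=0, ddof=1) / np.abs(X.mean(axis=0))
    return cv / cv.sum()


def combined_weights(subjective, objective, alpha=0.5):
    """Hypothetical linear blend of subjective and objective weights, renormalized to sum to 1."""
    w = alpha * np.asarray(subjective, dtype=float) + (1.0 - alpha) * np.asarray(objective, dtype=float)
    return w / w.sum()


# Toy matrix: the first indicator is constant, so it carries no objective information.
X_demo = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
w_entropy = entropy_weights(X_demo)
w_cv = cv_weights(X_demo)
w_objective = (w_entropy + w_cv) / 2.0  # assumption: equal mix of the two objective schemes
w_final = combined_weights([0.5, 0.5], w_objective, alpha=0.5)
```

In this toy case both objective methods assign essentially zero weight to the constant indicator, and the blend pulls its weight back up to 0.25 because the subjective weights treat the two indicators equally.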

3.2. Analogy Method Optimization

Based on a systematic review of prior studies and algorithms, we selected five classical machine-learning methods—support vector machine (SVM), random forest (RF), backpropagation neural network (BP), k-nearest neighbors (KNN), and decision tree (DT)—for comparative analysis. These methods span linear and nonlinear, parametric and nonparametric, model-based and distance-based paradigms, enabling a comprehensive evaluation of the proposed analogy indicator system. Their interpretability and ease of implementation also satisfy the reproducibility and engineering requirements of oilfield development decision-making. Although recent work has applied next-generation ensemble methods such as extreme gradient boosting (XGBoost) and LightGBM to production forecasting [19], those algorithms typically demand large sample sizes and extensive hyperparameter tuning, which risks overfitting under our sample conditions. In contrast, the five selected classical algorithms achieve robust performance without complex tuning, enhancing the generalizability of our findings and providing a valuable reference for future research.

3.2.1. Results of Single Indicator Analogy

A dataset of 663 onshore medium-to-high permeability sandstone oilfields was used for single indicator analogy. Machine-learning models were trained on 80% of the samples and tested on the remaining 20%. Test-set fitting accuracy was calculated to assess each method’s predictive performance. Table 18 shows that SVM and RF achieved accuracies of 85.81% and 74.1%, respectively, both meeting engineering requirements. The BP neural network, KNN, and DT showed lower accuracies.
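The 80/20 evaluation protocol can be sketched with scikit-learn as shown below. Several elements are assumptions: the data are synthetic, the eight input features are placeholders, the hyperparameters are library defaults, and the test-set R² is used as a stand-in score because the text does not define how "fitting accuracy" was computed. The function name `compare_regressors` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def compare_regressors(X, y, seed=0):
    """Train SVM and RF regressors on 80% of the samples and score them on the other 20%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        # SVR is sensitive to feature scale, so inputs are standardized first.
        "SVM": make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        "RF": RandomForestRegressor(n_estimators=200, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores, len(y_test)


# Synthetic stand-in for 663 oilfields with a recovery-factor-like target in percent.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(663, 8))
y_demo = 20.0 + 30.0 * X_demo[:, 0] + 10.0 * X_demo[:, 1] ** 2 + rng.normal(scale=1.0, size=663)
scores, n_test = compare_regressors(X_demo, y_demo)
```

With 663 samples, a 20% split holds out 133 oilfields for testing. Scores from a single random split can vary with the seed, so repeated splits or cross-validation would give a more stable comparison.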
Figure 11 presents predicted versus actual recovery values for each method. Green points denote training-data predictions, and red points denote test-data predictions. In the SVM plot, both green and red points cluster tightly along the 45° line, indicating a close fit on the training data and strong generalization to unseen cases. RF also aligns densely around the diagonal, although test-data points show slightly greater scatter, indicating robust performance with limited variance. By contrast, the BP neural network achieves a near-perfect fit on the training data but exhibits marked dispersion on the test data, signaling overfitting. KNN yields a broadly dispersed point cloud for both sets, revealing underfitting due to its local-averaging nature. Finally, DT displays a staircase pattern in the training data, characteristic of overfitting, and notable scatter in the test data, reflecting poor generalization. These observations indicate that, of the five methods, only SVM and RF balance bias and variance well enough to meet engineering requirements.
Therefore, support vector machine and random forest were selected for recovery factor prediction.
A total of 157 onshore low-permeability sandstone oilfields were analyzed similarly, with 80% of samples used for training and 20% for testing. Test set accuracy was calculated to evaluate the predictive performance of each method. As shown in Table 19, the small sample size resulted in poor outcomes across all five machine-learning methods, with the highest accuracy below 70%. This indicates that when the sample size is small (n < 200), the ability of conventional machine-learning algorithms to generalize is severely limited, making it difficult to attain high-precision predictions in oilfield development.
Figure 12, which presents the low-permeability dataset, shows that the training-data points follow similar trends as in the medium-high permeability case, while the test-data points are substantially more dispersed, indicating markedly poorer generalization when the sample size is small.

3.2.2. Results of Whole Asset Analogy

SVM, RF, BP neural network, KNN, and DT methods were used to perform whole asset analogy on 663 onshore medium-to-high permeability sandstone oilfields. The dataset comprises five asset levels with approximately equal numbers of oilfields in each level, indicating a balanced multi-class dataset. Twenty oilfields were set aside for testing, and the predicted asset categories were compared with the conventional empirical classification categories. The results are shown in Table 20.
DT achieved an accuracy of 25%, KNN 55%, and BP neural network 55%, indicating unsatisfactory performance. RF reached 70% accuracy, while SVM achieved 95%, both of which meet engineering requirements. SVM’s superior performance reflects its ability, through kernel-based high-dimensional mappings, to capture the complex nonlinear relationships among oilfield parameters even with limited training samples, making it particularly suitable for asset-level classification of medium-to-high permeability sandstone oilfields. With 95% accuracy, SVM can effectively replace traditional empirical classification methods, improving the objectivity and efficiency of oilfield evaluation.
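A hold-out evaluation of this kind can be sketched as follows. The data are synthetic and the labels are generated from a placeholder weighted score, so the resulting accuracy says nothing about the paper's results. The RBF kernel, the value `C=10.0`, the stratified split, and the function name `holdout_accuracy` are all assumptions, since the text does not report the kernel, hyperparameters, or how the 20 test oilfields were drawn.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def holdout_accuracy(X, y, n_test=20, seed=0):
    """Hold out n_test samples (stratified by level), train an SVM, return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=n_test, stratify=y, random_state=seed
    )
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test)), len(y_test)


# Synthetic stand-in: 663 oilfields, 9 features, five roughly balanced levels.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(663, 9))
score = X_demo @ rng.dirichlet(np.ones(9))         # placeholder weighted score
edges = np.quantile(score, [0.2, 0.4, 0.6, 0.8])   # quintile boundaries
y_demo = 5 - np.digitize(score, edges)             # Level 1 = highest score
accuracy, n_test = holdout_accuracy(X_demo, y_demo)
```

With 20 test oilfields, each misclassification changes the accuracy by five percentage points, so reported accuracies of 95%, 70%, 55%, and 25% correspond to 19, 14, 11, and 5 correct predictions, respectively.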

4. Conclusions

This study addresses the need for quick evaluation of overseas oilfield development projects facing complex processes, limited data, and scarce experience by developing a systematic analogy indicator system and testing related machine-learning-based analogy methods.
A database of 1436 oilfields was compiled, and 36 original indicators were systematically screened using correlation coefficient analysis, systematic clustering, and principal component analysis. The selected indicators were then classified based on probability distribution curves and expert judgment and weighted with a combined approach that integrates expert scoring with entropy and coefficient of variation methods. The outcome is a set of 26 key analogy indicators covering reservoir properties, trap and structural characteristics, fluid properties, reserve parameters, and development parameters. This indicator system balances static and dynamic features, as well as subjective and objective information, providing a strong foundation for further research using the analogy method.
Five machine-learning algorithms, namely support vector machine (SVM), random forest (RF), backpropagation neural network (BP), k-nearest neighbors (KNN), and decision tree (DT), were used for both single-indicator and whole-asset analogy tasks on actual oilfield data. For the 663 onshore medium-to-high permeability sandstone samples, SVM and RF achieved accuracies of at least 70% on both tasks, meeting engineering requirements. SVM performed the best, with 86% accuracy in recovery factor prediction and 95% accuracy in whole-asset classification. Conversely, for the 157 low-permeability sandstone samples, all methods scored below 70%, showing that traditional machine-learning algorithms are not practical when the sample size is small.
The developed methodology provides a standardized, programmatic workflow for the quick evaluation of new overseas oilfield development projects. Using the key analogy indicator system along with the SVM-based analogy method enables rapid screening of multiple candidate fields, boosting evaluation efficiency and reducing investment risks caused by inconsistent judgments or limited experience. Future work could involve iterative refinement and reweighting of the indicator system as development technologies progress, as well as incorporating and comparing additional advanced algorithms to further enhance the accuracy and reliability of quick project screening and assessment.

Author Contributions

Conceptualization, M.Z. and B.Z.; methodology, M.Z., Z.L. and C.Y.; software, M.Z. and F.H.; investigation, T.Q., B.W. and L.F.; data curation, Z.L.; writing—original draft preparation, M.Z.; writing—review and editing, Z.L.; supervision, Z.L.; project administration, Z.L.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Major Science and Technology Project of China National Petroleum Corporation, “Research on Integrated Technical, Economic, and Commercial Evaluation Technologies for Oil and Gas Exploration Assets in a Low-Carbon Context” (Grant No. 2023ZZ07-05).

Data Availability Statement

The data presented in this study are available on request from the corresponding author; they are not publicly available due to privacy restrictions.

Conflicts of Interest

Authors Muzhen Zhang, Zhanxiang Lei, Baoquan Zeng, Fei Huang, Tailai Qu, Bin Wang and Li Fu were employed by the company Research Institute of Petroleum Exploration & Development, PetroChina. Author Chengyun Yan was employed by the company The First Natural Gas Plant of PetroChina Qinghai Oilfield Company. All the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Dong, W.; Jiao, J.; Xie, S.; Lyu, C.; Cui, G.; Meng, J. Cumulative production curve method for the quantitative evaluation on the effect of oilfield development measures: A case study of the nitrogen injection pilot in Yanling oilfield, Bohai Bay Basin. Pet. Explor. Dev. 2016, 43, 672–678.
  2. Ponomarenko, T.; Marin, E.; Galevskiy, S. Economic Evaluation of Oil and Gas Projects: Justification of Engineering Solutions in the Implementation of Field Development Projects. Energies 2022, 15, 3103.
  3. Ponomarenko, T.V.; Sergeev, I.B. Valuation of mineral assets of a mining company on the basis of the option approach. J. Min. Inst. 2011, 191, 164–175.
  4. Mu, L.X.; Fan, Z.F.; Xu, A.Z. Development characteristics, models and strategies for overseas oil and gas fields. Pet. Explor. Dev. 2018, 45, 735–744.
  5. Li, Z.X.; Liu, J.Y.; Luo, D.K.; Wang, J.J. Study of evaluation method for the overseas oil and gas investment based on risk compensation. Pet. Sci. 2020, 17, 858–871.
  6. Yusgiantoro, P.; Hsiao, F.S.T. Production-sharing contracts and decision-making in oil production: The case of Indonesia. Energy Econ. 1993, 15, 245–256.
  7. Sidle, R.E.E.; Lee, W.J.J. An Update on the Use of Reservoir Analogs for the Estimation of Oil and Gas Reserves. SPE Econ. Manag. 2010, 2, 80–85.
  8. Liu, Z.L.; Geng, M.; Zhang, Y.Z. The Method for Selecting Analogous Reservoirs Based on SEC. J. Phys. Conf. Ser. 2023, 2520, 012008.
  9. Martín Rodríguez, H.; Escobar, E.; Embid, S.; Rodríguez Morillas, N.; Hegazy, M.; Lake, L.W. New Approach to Identify Analogous Reservoirs. SPE Econ. Manag. 2014, 6, 173–184.
  10. El-Nikhely, A.; El-Gendy, N.H.; Bakr, A.M.; Zawra, M.S.; Ondrak, R.; Barakat, M.K. Decoding of seismic data for complex stratigraphic traps revealing by seismic attributes analogy in Yidma/Alamein concession area Western Desert, Egypt. J. Petrol. Explor. Prod. Technol. 2022, 12, 3325–3338.
  11. Liu, X.H.; Hu, T.; Pang, X.Q.; Xu, Z.; Wang, T.; Zhang, X.W.; Wang, E.Z.; Wu, Z.Y. Evaluation of natural gas hydrate resources in the South China Sea using a new genetic analogy method. Pet. Sci. 2022, 19, 48–57.
  12. Awoleke, O.O.; Lane, R.H. Analysis of data from the Barnett Shale using conventional statistical and virtual intelligence techniques. SPE Res. Eval. Eng. 2011, 14, 544–556.
  13. Iraji, S.; Soltanmohammadi, R.; Matheus, G.F.; Basso, M.; Vidal, A.C. Application of unsupervised learning and deep learning for rock type prediction and petrophysical characterization using multi-scale data. Geoenergy Sci. Eng. 2023, 230, 212241.
  14. Wang, Y.; Cheng, S.Q.; Zhang, F.B.; Feng, N.C.; Li, L.; Shen, X.Z.; Li, J.H.; Yu, H. Big data technique in the reservoir parameters’ prediction and productivity evaluation: A field case in western South China Sea. Gondwana Res. 2021, 96, 22–36.
  15. Werneck, R.d.O.; Prates, R.; Moura, R.; Gonçalves, M.M.; Castro, M.; Soriano-Vargas, A.; Mendes Júnior, P.R.; Hossain, M.M.; Zampieri, M.F.; Ferreira, A.F.; et al. Data-driven deep-learning forecasting for oil production and pressure. J. Pet. Sci. Eng. 2022, 210, 109937.
  16. Zhou, Q.; Dilmore, R.; Kleit, A.; Wang, J.Y. Evaluating gas production performances in Marcellus using data mining technologies. J. Nat. Gas Sci. Eng. 2014, 20, 109–120.
  17. Yuan, Z.H.; Qin, W.Z.; Zhao, J.S. Smart Manufacturing for the Oil Refining and Petrochemical Industry. Engineering 2017, 3, 179–182.
  18. Zhang, M.Z.; Jia, A.L.; Lei, Z.X. Inter-well reservoir parameter prediction based on LSTM-Attention network and sedimentary microfacies. Geoenergy Sci. Eng. 2024, 235, 212723.
  19. Bai, W.P.; Cheng, S.Q.; Guo, X.Y.; Wang, Y.; Guo, Q.; Tan, C.D. Oilfield analogy and productivity prediction based on machine learning: Field cases in PL oilfield, China. Pet. Sci. 2024, 21, 2554–2570.
  20. Guo, Q.; Cheng, S.Q.; Zeng, F.H.; Wang, Y.; Lu, C.; Tan, C.D.; Li, G.L. Reservoir permeability prediction based on analogy and machine learning methods: Field cases in DLG Block of Jing’an Oilfield, China. Lithosphere 2022, 2022, 5249460.
  21. Mahdaviara, M.; Sharifi, M.; Ahmadi, M. Toward evaluation and screening of the enhanced oil recovery scenarios for low permeability reservoirs using statistical and machine learning techniques. Fuel 2022, 325, 124795.
  22. Rahimi, M.; Riahi, M.A. Reservoir facies classification based on random forest and geostatistics methods in an offshore oilfield. J. Appl. Geophys. 2022, 201, 104640.
  23. Zhang, M.Z.; Jia, A.L.; Lei, Z.X.; Lei, G. A comprehensive asset evaluation method for oil and gas projects. Processes 2023, 11, 2398.
  24. Kassem, M.A.; Khoiry, M.A.; Hamzah, N. Using Relative Importance Index Method for Developing Risk Map in Oil and Gas Construction Projects. J. Kejuruter. 2020, 32, 441–453.
  25. Bi, A.; Huang, S.; Sun, X. Risk Assessment of Oil and Gas Pipeline Based on Vague Set-Weighted Set Pair Analysis Method. Mathematics 2023, 11, 349.
  26. Ni, S.; Tang, Y.; Wang, G.; Yang, L.; Lei, B.; Zhang, Z. Risk identification and quantitative assessment method of offshore platform equipment. Energy Rep. 2022, 8, 7219–7229.
  27. Rui, Z.; Lu, J.; Zhang, Z.; Guo, R.; Ling, K.; Zhang, R.; Patil, S. A quantitative oil and gas reservoir evaluation system for development. J. Nat. Gas Sci. Eng. 2017, 42, 31–39.
  28. Vilela, M.; Oluyemi, G.; Petrovski, A. A fuzzy inference system applied to value of information assessment for oil and gas industry. Decis. Mak. Appl. Manag. Eng. 2019, 2, 1–18.
  29. Ilyushin, Y.; Nosova, V.; Krauze, A. Application of Systems Analysis Methods to Construct a Virtual Model of the Field. Energies 2025, 18, 1012.
  30. Li, M.; Qu, Z.; Wang, M.; Ran, W. The Influence of Micro-Heterogeneity on Water Injection Development in Low-Permeability Sandstone Oil Reservoirs. Minerals 2023, 13, 1533.
Figure 1. Indicator analogy process for oilfield development projects.
Figure 2. Basic analogy indicators for oilfield development projects. Qualitative indicators are highlighted in blue, and quantitative indicators are shown in black.
Figure 3. Flowchart for the construction of the key analogy indicator system in oilfield development projects.
Figure 4. Flowchart for machine-learning-based optimization of analogy methods for oilfield development projects.
Figure 5. Histogram of composite analogy scores for 663 medium-to-high permeability sandstone oilfields.
Figure 6. Hierarchical clustering dendrogram of static parameters.
Figure 7. Hierarchical clustering dendrogram of development parameters.
Figure 8. Scree plot of static parameters.
Figure 9. Scree plot of development parameters.
Figure 10. Key analogy indicators for oilfield development projects. Qualitative indicators are highlighted in blue, and quantitative indicators are shown in black.
Figure 11. Comparison of predicted versus actual recovery values for medium-high permeability sandstone oilfields computed by five machine-learning methods.
Figure 12. Comparison of predicted versus actual recovery values for low-permeability sandstone oilfields computed by five machine-learning methods.
Table 1. Comparison of machine-learning methods.
| Method | Advantages | Disadvantages | Applicable Scenarios | Algorithm Principle |
|---|---|---|---|---|
| Support Vector Machine (SVM) | Performs well on high-dimensional, nonlinearly separable data; good generalization. | Lacks interpretability; decision process not intuitive. | Suitable for relatively small samples with high feature dimensionality. | Maps inputs via kernel functions into a high-dimensional space and finds the hyperplane that maximizes class margin. |
| Random Forest (RF) | Strong noise robustness; resistant to overfitting. | High memory footprint; computationally expensive for training and inference. | Works well on high-dimensional or very large datasets. | Ensembles multiple decision trees trained on bootstrapped samples and random feature subsets; predictions by majority vote or averaging. |
| Backpropagation Neural Network (BP) | Flexible architecture; capable of modeling complex nonlinear mappings. | Black-box model; internal workings hard to interpret. | Well suited for large datasets and complex nonlinear tasks. | Multi-layer feedforward network trained by backpropagation using gradient descent to minimize prediction error. |
| K-Nearest Neighbors (KNN) | Simple to implement; intuitive prediction by neighbors. | High storage and computation cost, especially for large datasets. | Suitable for low-dimensional problems with modest sample sizes. | Instance-based learning: finds k-nearest neighbors by distance in feature space; predicts by majority vote or averaging. |
| Decision Tree (DT) | Easy to understand; highly interpretable. | Prone to overfitting; may generalize poorly. | Suits problems with few classes and clear feature semantics. | Recursively partitions the feature space by selecting splits that maximize information gain or minimize Gini impurity. |
Table 2. Correlation matrix of static parameters. Correlation coefficients greater than or equal to 0.6 are highlighted in bold to indicate relatively strong associations.
| Parameter | Net Pay Thickness | Average Net-to-Gross Ratio | Average Porosity | Average Permeability | Oil Saturation | Reservoir Area | Reservoir Burial Depth | Initial Reservoir Pressure | Initial Reservoir Temperature | Oil API Gravity | Reservoir Oil Viscosity | Initial Gas–Oil Ratio | Oil Volume Factor | Bubble Point Pressure | Original Oil in Place | Recoverable Oil Reserves | Reserves Abundance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Net Pay Thickness | 1.00 | 0.21 | −0.16 | −0.13 | 0.08 | −0.10 | 0.29 | 0.47 | 0.27 | 0.05 | −0.08 | 0.23 | 0.24 | 0.50 | 0.01 | 0.05 | 0.34 |
| Average Net-to-Gross Ratio | 0.21 | 1.00 | 0.11 | 0.25 | 0.21 | 0.03 | 0.01 | 0.06 | 0.13 | −0.12 | 0.27 | −0.02 | 0.00 | 0.14 | 0.18 | 0.06 | 0.24 |
| Average Porosity | −0.16 | 0.11 | 1.00 | 0.51 | −0.03 | 0.00 | −0.36 | −0.44 | −0.44 | −0.50 | 0.29 | −0.50 | −0.53 | −0.36 | 0.16 | 0.19 | 0.28 |
| Average Permeability | −0.13 | 0.25 | 0.51 | 1.00 | 0.19 | 0.14 | −0.23 | −0.25 | −0.23 | −0.54 | 0.45 | −0.29 | −0.31 | −0.22 | 0.17 | 0.15 | −0.05 |
| Oil Saturation | 0.08 | 0.21 | −0.03 | 0.19 | 1.00 | 0.15 | −0.02 | 0.13 | 0.05 | 0.01 | 0.05 | 0.14 | 0.15 | 0.17 | 0.16 | 0.17 | −0.01 |
| Reservoir Area | −0.10 | 0.03 | 0.00 | 0.14 | 0.15 | 1.00 | 0.01 | 0.08 | −0.07 | −0.11 | 0.11 | −0.10 | −0.08 | −0.06 | **0.80** | **0.72** | −0.06 |
| Reservoir Burial Depth | 0.29 | 0.01 | −0.36 | −0.23 | −0.02 | 0.01 | 1.00 | **0.79** | **0.82** | 0.16 | −0.15 | 0.38 | 0.46 | **0.61** | 0.10 | 0.01 | −0.10 |
| Initial Reservoir Pressure | 0.47 | 0.06 | −0.44 | −0.25 | 0.13 | 0.08 | 0.79 | 1.00 | **0.73** | 0.22 | −0.17 | 0.37 | 0.44 | 0.59 | 0.12 | 0.05 | 0.04 |
| Initial Reservoir Temperature | 0.27 | 0.13 | −0.44 | −0.23 | 0.05 | −0.07 | 0.82 | 0.73 | 1.00 | 0.30 | −0.18 | 0.48 | 0.58 | 0.58 | 0.02 | −0.07 | −0.11 |
| Oil API Gravity | 0.05 | −0.12 | −0.60 | −0.54 | 0.01 | −0.11 | 0.16 | 0.22 | 0.30 | 1.00 | −0.54 | 0.49 | 0.55 | 0.26 | −0.28 | −0.16 | −0.31 |
| Reservoir Oil Viscosity | −0.08 | 0.27 | −0.29 | 0.71 | 0.05 | 0.11 | −0.15 | −0.17 | −0.18 | −0.54 | 1.00 | −0.22 | −0.24 | −0.24 | 0.20 | 0.00 | 0.05 |
| Initial Gas–Oil Ratio | 0.23 | −0.02 | −0.50 | −0.29 | 0.14 | −0.10 | 0.38 | 0.37 | 0.48 | 0.49 | −0.22 | 1.00 | **0.93** | **0.73** | −0.12 | −0.11 | −0.11 |
| Oil Volume Factor | 0.24 | 0.00 | −0.53 | −0.31 | 0.15 | −0.08 | 0.46 | 0.44 | 0.58 | 0.55 | −0.24 | 0.93 | 1.00 | **0.70** | −0.10 | −0.11 | −0.12 |
| Bubble Point Pressure | 0.50 | 0.14 | −0.36 | −0.22 | 0.17 | −0.06 | 0.61 | 0.59 | 0.58 | 0.26 | −0.24 | 0.73 | 0.70 | 1.00 | −0.03 | −0.02 | 0.02 |
| Original Oil in Place | 0.01 | 0.18 | 0.16 | 0.17 | 0.16 | 0.80 | 0.10 | 0.12 | 0.02 | −0.28 | 0.20 | −0.12 | −0.10 | −0.03 | 1.00 | **0.86** | 0.16 |
| Recoverable Oil Reserves | 0.05 | 0.06 | 0.19 | 0.15 | 0.17 | 0.72 | 0.01 | 0.05 | −0.07 | −0.16 | 0.00 | −0.11 | −0.11 | −0.02 | 0.86 | 1.00 | 0.14 |
| Reserves Abundance | 0.34 | 0.24 | 0.28 | −0.05 | −0.01 | −0.06 | −0.10 | 0.04 | −0.11 | −0.31 | 0.05 | −0.11 | −0.12 | 0.02 | 0.16 | 0.14 | 1.00 |
Table 3. Correlation matrix of development parameters.
| Parameter | Well Pattern Density | Initial Production per Well | Peak Production Rate | Composite Water Cut | Composite Decline Rate | Recovery Efficiency of Reserves | Oil Production Rate | Recovery Factor |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Well Pattern Density | 1.00 | 0.47 | 0.08 | −0.46 | 0.18 | 0.00 | 0.47 | 0.08 |
| Initial Production per Well | 0.47 | 1.00 | 0.10 | −0.26 | 0.13 | −0.08 | 0.31 | 0.08 |
| Peak Production Rate | 0.08 | 0.10 | 1.00 | 0.12 | −0.10 | 0.04 | −0.06 | 0.09 |
| Composite Water Cut | −0.46 | −0.26 | 0.12 | 1.00 | −0.10 | 0.21 | −0.19 | 0.21 |
| Composite Decline Rate | 0.18 | 0.13 | −0.10 | −0.10 | 1.00 | 0.09 | 0.20 | −0.17 |
| Recovery Efficiency of Reserves | 0.00 | −0.08 | 0.04 | 0.21 | 0.09 | 1.00 | 0.26 | 0.20 |
| Oil Production Rate | 0.47 | 0.31 | −0.06 | −0.19 | 0.20 | 0.26 | 1.00 | 0.36 |
| Recovery Factor | 0.08 | 0.08 | 0.09 | 0.21 | −0.17 | 0.20 | 0.36 | 1.00 |
Table 4. Classification statistics of static and development parameters based on the correlation coefficient method.
Correlation Coefficient Threshold: 0.6
Static Parameters (11 Categories):
1. Reservoir Burial Depth, Initial Reservoir Pressure, Initial Reservoir Temperature;
2. Initial Gas–Oil Ratio, Oil Volume Factor, Bubble Point Pressure;
3. Reservoir Area, Original Oil in Place, Recoverable Oil Reserves;
4. Average Permeability;
5. Reservoir Oil Viscosity;
6. Reserves Abundance;
7. Net Pay Thickness;
8. Average Net-to-Gross Ratio;
9. Average Porosity;
10. Oil Saturation;
11. Oil API Gravity.
Development Parameters (8 Categories):
  • Well Pattern Density;
  • Initial Production per Well;
  • Peak Production Rate;
  • Composite Water Cut;
  • Composite Decline Rate;
  • Recovery Efficiency of Reserves;
  • Oil Production Rate;
  • Recovery Factor.
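One simple way to turn a correlation matrix into categories is to link every pair of indicators whose absolute coefficient reaches the threshold and take the connected groups. The sketch below applies that rule to five static indicators, using their coefficients from Table 2. The linking rule and the function name `group_by_correlation` are assumptions for illustration; the section does not state the authors' exact grouping procedure.

```python
def group_by_correlation(names, corr, threshold=0.6):
    """Group indicators linked by |r| >= threshold (connected components)."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links up to the representative of n's group.
        while parent[n] != n:
            n = parent[n]
        return n

    for (a, b), r in corr.items():
        if abs(r) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values(), key=len, reverse=True)

names = ["Reservoir Area", "Original Oil in Place", "Recoverable Oil Reserves",
         "Reserves Abundance", "Oil Saturation"]

# Pairwise coefficients copied from Table 2 (upper triangle).
corr = {
    ("Reservoir Area", "Original Oil in Place"): 0.80,
    ("Reservoir Area", "Recoverable Oil Reserves"): 0.72,
    ("Original Oil in Place", "Recoverable Oil Reserves"): 0.86,
    ("Reservoir Area", "Reserves Abundance"): -0.06,
    ("Original Oil in Place", "Reserves Abundance"): 0.16,
    ("Recoverable Oil Reserves", "Reserves Abundance"): 0.14,
    ("Oil Saturation", "Reservoir Area"): 0.15,
    ("Oil Saturation", "Original Oil in Place"): 0.16,
    ("Oil Saturation", "Recoverable Oil Reserves"): 0.17,
    ("Oil Saturation", "Reserves Abundance"): -0.01,
}

groups = group_by_correlation(names, corr)
for g in groups:
    print(g)
```

On this subset the rule reproduces category 3 of Table 4 and leaves Reserves Abundance and Oil Saturation as single indicators. Applied to all 17 static parameters it would not reproduce Table 4 exactly: the 0.61 between Reservoir Burial Depth and Bubble Point Pressure would merge categories 1 and 2, so the published grouping presumably involved additional judgment.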
Table 5. Classification of static parameters under different clustering distance thresholds. Grouped parameters are indicated within parentheses, while ungrouped entries represent individual indicators.
| Clustering Distance Threshold | Number of Categories | Static Parameter Groups |
| --- | --- | --- |
| 5 | 8 | (Reservoir Area, Reservoir Oil Viscosity, Net Pay Thickness, Average Permeability, Reserves Abundance); (Original Oil in Place, Recoverable Oil Reserves); (Initial Gas–Oil Ratio, Oil Volume Factor, Reservoir Burial Depth, Initial Reservoir Pressure, Bubble Point Pressure); Initial Reservoir Temperature; Oil API Gravity; Average Porosity; Oil Saturation; Average Net-to-Gross Ratio. |
| 4 | 10 | (Reservoir Area, Reservoir Oil Viscosity, Net Pay Thickness, Average Permeability); (Original Oil in Place, Recoverable Oil Reserves); (Initial Gas–Oil Ratio, Oil Volume Factor); (Reservoir Burial Depth, Initial Reservoir Pressure, Bubble Point Pressure); Reserves Abundance; Initial Reservoir Temperature; Oil API Gravity; Average Porosity; Oil Saturation; Average Net-to-Gross Ratio. |
| 3 | 12 | (Reservoir Area, Reservoir Oil Viscosity, Net Pay Thickness); (Original Oil in Place, Recoverable Oil Reserves); (Initial Gas–Oil Ratio, Oil Volume Factor); (Reservoir Burial Depth, Initial Reservoir Pressure); Average Permeability; Reserves Abundance; Bubble Point Pressure; Initial Reservoir Temperature; Oil API Gravity; Average Porosity; Oil Saturation; Average Net-to-Gross Ratio. |
| 2 | 15 | (Reservoir Area, Reservoir Oil Viscosity); (Initial Gas–Oil Ratio, Oil Volume Factor); Net Pay Thickness; Average Permeability; Reserves Abundance; Original Oil in Place; Recoverable Oil Reserves; Bubble Point Pressure; Reservoir Burial Depth; Initial Reservoir Pressure; Initial Reservoir Temperature; Oil API Gravity; Average Porosity; Oil Saturation; Average Net-to-Gross Ratio. |
Table 6. Classification of development parameters under different clustering distance thresholds. Grouped parameters are indicated within parentheses, while ungrouped entries represent individual indicators.
| Clustering Distance Threshold | Number of Categories | Development Parameter Groups |
| --- | --- | --- |
| 5 | 3 | (Well Pattern Density, Initial Production per Well, Peak Production Rate, Composite Decline Rate, Oil Production Rate); (Composite Water Cut, Recovery Efficiency of Reserves); Recovery Factor. |
| 4 | 4 | (Well Pattern Density, Initial Production per Well, Peak Production Rate, Composite Decline Rate, Oil Production Rate); Composite Water Cut; Recovery Efficiency of Reserves; Recovery Factor. |
| 3 | 5 | (Well Pattern Density, Initial Production per Well, Peak Production Rate); (Composite Decline Rate, Oil Production Rate); Composite Water Cut; Recovery Efficiency of Reserves; Recovery Factor. |
| 2 | 6 | (Well Pattern Density, Initial Production per Well); (Composite Decline Rate, Oil Production Rate); Peak Production Rate; Composite Water Cut; Recovery Efficiency of Reserves; Recovery Factor. |
Table 7. Explained variance of factors for static parameters.
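| Factor Number | Eigenvalue | Explained Variance (%) | Cumulative Explained Variance (%) |
| --- | --- | --- | --- |
| 1 | 5.193 | 30.208 | 30.208 |
| 2 | 2.935 | 17.076 | 47.284 |
| 3 | 1.876 | 10.916 | 58.200 |
| 4 | 1.578 | 9.181 | 67.381 |
| 5 | 1.267 | 7.369 | 74.750 |
| 6 | 1.005 | 5.845 | 80.596 |
| 7 | 0.816 | 4.744 | 85.340 |
| 8 | 0.733 | 4.265 | 89.605 |
| 9 | 0.566 | 3.292 | 92.897 |
| 10 | 0.344 | 2.003 | 94.899 |
| 11 | 0.262 | 1.527 | 96.426 |
| 12 | 0.205 | 1.193 | 97.619 |
| 13 | 0.158 | 0.919 | 98.538 |
| 14 | 0.122 | 0.709 | 99.247 |
| 15 | 0.078 | 0.451 | 99.698 |
| 16 | 0.052 | 0.302 | 100 |
| 17 | 0 | 0 | 100 |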
Table 8. Explained variance of factors for development parameters.
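| Factor Number | Eigenvalue | Explained Variance (%) | Cumulative Explained Variance (%) |
| --- | --- | --- | --- |
| 1 | 2.197 | 27.466 | 27.466 |
| 2 | 1.572 | 19.646 | 47.112 |
| 3 | 1.173 | 14.658 | 61.769 |
| 4 | 0.962 | 12.027 | 73.796 |
| 5 | 0.723 | 9.037 | 82.833 |
| 6 | 0.589 | 7.363 | 90.197 |
| 7 | 0.405 | 5.058 | 95.255 |
| 8 | 0.380 | 4.745 | 100 |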
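The explained-variance columns follow directly from the eigenvalues: each share is the eigenvalue divided by the eigenvalue sum, and the cumulative column is the running total. The sketch below recomputes both for the development parameters from the rounded eigenvalues of Table 8, then counts the factors retained at the 0.589 eigenvalue threshold listed in Table 11. Variable names are illustrative, and small differences from the published percentages come from rounding of the eigenvalues.

```python
# Eigenvalues of the eight development-parameter factors (Table 8).
eigenvalues = [2.197, 1.572, 1.173, 0.962, 0.723, 0.589, 0.405, 0.380]

total = sum(eigenvalues)
explained = [100 * ev / total for ev in eigenvalues]  # share of variance, %

# Running total of the explained-variance shares.
cumulative = []
running = 0.0
for share in explained:
    running += share
    cumulative.append(running)

# Keep every factor whose eigenvalue is at or above the threshold.
threshold = 0.589
n_retained = sum(1 for ev in eigenvalues if ev >= threshold)

for i, (ev, share, cum) in enumerate(zip(eigenvalues, explained, cumulative), start=1):
    print(f"Factor {i}: eigenvalue={ev:.3f}  explained={share:.2f}%  cumulative={cum:.2f}%")
print("Factors retained:", n_retained)
```

With this threshold six factors are retained, which together account for roughly 90% of the variance.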
Table 9. Static parameter rotated factor loadings. The bold values represent the loading with the greatest absolute magnitude for each parameter, indicating its primary associated factor.
| Parameter | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 | Factor 7 | Factor 8 | Factor 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Initial Gas–Oil Ratio | **−0.931** | 0.065 | −0.115 | 0.027 | 0.058 | −0.194 | −0.179 | −0.024 | −0.055 |
| Oil Volume Factor | **−0.881** | 0.053 | −0.152 | 0.046 | 0.070 | −0.236 | −0.274 | 0.020 | −0.027 |
| Bubble Point Pressure | **−0.688** | 0.017 | −0.094 | −0.016 | 0.082 | 0.003 | −0.464 | 0.057 | −0.375 |
| Original Oil in Place | 0.052 | **−0.937** | 0.130 | −0.131 | 0.039 | 0.073 | −0.104 | 0.093 | 0.038 |
| Recoverable Oil Reserves | 0.027 | **−0.921** | −0.035 | −0.056 | 0.070 | 0.133 | 0.028 | −0.003 | −0.081 |
| Reservoir Area | 0.052 | **−0.909** | 0.067 | 0.119 | 0.059 | −0.074 | 0.010 | −0.021 | 0.054 |
| Average Permeability | 0.083 | −0.091 | **0.746** | 0.218 | 0.146 | 0.506 | 0.141 | 0.154 | 0.007 |
| Reservoir Oil Viscosity | 0.126 | −0.056 | **0.983** | −0.043 | 0.003 | −0.224 | 0.096 | 0.124 | 0.041 |
| Oil API Gravity | −0.343 | 0.118 | **−0.596** | 0.330 | 0.025 | −0.533 | −0.001 | 0.089 | −0.013 |
| Reserves Abundance | 0.044 | −0.055 | 0.003 | **−0.936** | −0.011 | 0.122 | 0.057 | 0.132 | −0.189 |
| Oil Saturation | −0.104 | −0.113 | 0.051 | 0.011 | **0.979** | 0.008 | −0.013 | 0.094 | −0.029 |
| Average Porosity | 0.294 | −0.075 | −0.097 | −0.127 | −0.003 | **0.931** | 0.251 | 0.071 | 0.060 |
| Reservoir Burial Depth | −0.216 | −0.045 | −0.052 | 0.048 | −0.068 | −0.064 | **−0.924** | −0.035 | −0.080 |
| Initial Reservoir Temperature | −0.316 | 0.042 | −0.100 | 0.065 | −0.001 | −0.146 | **−0.851** | 0.154 | 0.006 |
| Initial Reservoir Pressure | −0.144 | −0.090 | −0.097 | −0.046 | 0.121 | −0.213 | **−0.827** | −0.017 | −0.306 |
| Average Net-to-Gross Ratio | −0.013 | −0.052 | 0.166 | −0.119 | 0.102 | 0.064 | −0.054 | **0.955** | −0.096 |
| Net Pay Thickness | −0.135 | 0.015 | −0.037 | −0.196 | 0.027 | −0.056 | −0.211 | 0.105 | **−0.921** |
Table 10. Development parameter rotated factor loadings. The bold values represent the loading with the greatest absolute magnitude for each parameter, indicating its primary associated factor.
| Parameter | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Well Pattern Density | **−0.711** | −0.214 | 0.158 | 0.143 | 0.053 | −0.406 |
| Composite Water Cut | **0.884** | −0.173 | 0.006 | 0.115 | 0.160 | 0.081 |
| Recovery Factor | 0.144 | **−0.914** | −0.158 | 0.064 | 0.057 | −0.032 |
| Oil Production Rate | −0.418 | **−0.632** | 0.253 | −0.103 | 0.298 | −0.218 |
| Composite Decline Rate | −0.049 | 0.066 | **0.980** | −0.051 | 0.042 | −0.062 |
| Peak Production Rate | 0.032 | −0.022 | −0.050 | **0.989** | 0.020 | −0.050 |
| Recovery Efficiency of Reserves | 0.080 | −0.098 | 0.041 | 0.023 | **0.979** | 0.044 |
| Initial Production per Well | −0.156 | −0.062 | 0.058 | 0.047 | −0.046 | **−0.972** |
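The rule stated in the caption, that each parameter belongs to the factor on which its loading has the greatest absolute magnitude, can be applied mechanically. The sketch below does so for the rotated loadings of Table 10; the helper name `primary_factor` is illustrative.

```python
# Rotated factor loadings for the development parameters (Table 10).
loadings = {
    "Well Pattern Density": [-0.711, -0.214, 0.158, 0.143, 0.053, -0.406],
    "Composite Water Cut": [0.884, -0.173, 0.006, 0.115, 0.160, 0.081],
    "Recovery Factor": [0.144, -0.914, -0.158, 0.064, 0.057, -0.032],
    "Oil Production Rate": [-0.418, -0.632, 0.253, -0.103, 0.298, -0.218],
    "Composite Decline Rate": [-0.049, 0.066, 0.980, -0.051, 0.042, -0.062],
    "Peak Production Rate": [0.032, -0.022, -0.050, 0.989, 0.020, -0.050],
    "Recovery Efficiency of Reserves": [0.080, -0.098, 0.041, 0.023, 0.979, 0.044],
    "Initial Production per Well": [-0.156, -0.062, 0.058, 0.047, -0.046, -0.972],
}

def primary_factor(row):
    """Return the 1-based index of the factor with the largest |loading|."""
    return max(range(len(row)), key=lambda j: abs(row[j])) + 1

# Collect the parameters that share a primary factor.
groups = {}
for name, row in loadings.items():
    groups.setdefault(primary_factor(row), []).append(name)

for factor in sorted(groups):
    print(f"Factor {factor}: {', '.join(groups[factor])}")
```

The result is six categories, two of which contain a pair of parameters (Well Pattern Density with Composite Water Cut, and Recovery Factor with Oil Production Rate).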
Table 11. Classification statistics of static and development parameters based on the principal component analysis method.
Static Parameters (9 Categories), eigenvalue threshold 0.566:
  • Reservoir Burial Depth, Initial Reservoir Pressure, Initial Reservoir Temperature;
  • Initial Gas–Oil Ratio, Oil Volume Factor, Bubble Point Pressure;
  • Reservoir Area, Original Oil in Place, Recoverable Oil Reserves;
  • Average Permeability, Reservoir Oil Viscosity, Oil API Gravity;
  • Reserves Abundance;
  • Net Pay Thickness;
  • Average Net-to-Gross Ratio;
  • Average Porosity;
  • Oil Saturation.
Development Parameters (6 Categories), eigenvalue threshold 0.589:
  • Well Pattern Density, Composite Water Cut;
  • Recovery Factor, Oil Production Rate;
  • Composite Decline Rate;
  • Peak Production Rate;
  • Recovery Efficiency of Reserves;
  • Initial Production per Well.
Table 12. Summary of static parameter indicator classification results by three screening methods. Indicators marked in the same color belong to the same category; those left unshaded are treated as individual indicators.
Column groups: Reservoir Properties; Trap and Structural Characteristics; Fluid Properties; Reserve Parameters.
Indicator columns: Net Pay Thickness; Average Net-to-Gross Ratio; Average Porosity; Average Permeability; Oil Saturation; Reservoir Area; Reservoir Burial Depth; Initial Reservoir Pressure; Initial Reservoir Temperature; Oil API Gravity; Reservoir Oil Viscosity; Initial Gas–Oil Ratio; Bubble Point Pressure; Oil Volume Factor; Original Oil in Place; Recoverable Oil Reserves; Reserves Abundance.

| Method | Number of Categories |
| --- | --- |
| Correlation Analysis | 11 |
| Systematic Clustering | 12 |
| Principal Component Analysis | 9 |
Table 13. Summary of development parameter indicator classification results by three screening methods. Indicators marked in the same color belong to the same category; those left unshaded are treated as individual indicators.
Indicator columns (Development Parameters): Well Pattern Density; Initial Production per Well; Peak Production Rate; Composite Water Cut; Composite Decline Rate; Recovery Efficiency of Reserves; Oil Production Rate; Recovery Factor.

| Method | Number of Categories |
| --- | --- |
| Correlation Analysis | 8 |
| Systematic Clustering | 5 |
| Principal Component Analysis | 6 |
Table 14. Threshold values for classifying indicators by distribution types.
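Normal distribution:

| Indicator | Unit | μ − σ | μ − σ/2 | μ + σ/2 | μ + σ |
| --- | --- | --- | --- | --- | --- |
| Average Porosity | % | 13.3 | 17 | 25.94 | 30 |
| Reservoir Burial Depth | m | 1250 | 1700 | 2900 | 3500 |
| Initial Reservoir Pressure | MPa | 9.82 | 16.3 | 27.6 | 34.3 |
| Oil API Gravity | ° | 25 | 30 | 38 | 41.98 |
| Original Oil in Place | MMbbl | 50.9 | 121.5 | 691 | 1650 |
| Recovery Factor | % | 11.98 | 19 | 30 | 39.84 |

Exponential distribution:

| Indicator | Unit | 15% | 30% | 70% | 85% |
| --- | --- | --- | --- | --- | --- |
| Net Pay Thickness | m | 7.5 | 14.1 | 50 | 85 |
| Average Permeability | mD | 29.7 | 110 | 1152.67 | 2187 |
| Oil Volume Factor | - | 1.08 | 1.13 | 1.34 | 1.5 |
| Bubble Point Pressure | MPa | 4.67 | 8.92 | 19.1 | 27.5 |
| Reserves Abundance | MMbbl/km² | 1.4 | 3.83 | 17.17 | 33.33 |
| Well Pattern Density | km²/well | 0.12 | 0.27 | 1.38 | 3.04 |
| Initial Production per Well | bbl/d | 3.56 | 10.51 | 85.63 | 269.27 |
| Peak Production Rate | kb/d | 1.6 | 5.15 | 34.44 | 84.6 |
| Oil Production Rate | % | 0.6 | 1.24 | 3.62 | 5.71 |

Uniform distribution:

| Indicator | Unit | 20% | 40% | 60% | 80% |
| --- | --- | --- | --- | --- | --- |
| Average Net-to-Gross Ratio | - | 0.2 | 0.4 | 0.6 | 0.8 |
| Recovery Efficiency of Reserves | % | 37.34 | 70.2 | 100 | 100 |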
Table 15. Classification standards for key quantitative indicators.
| Indicator Name | Unit | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Distribution Type |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Net Pay Thickness | m | ≥80 | 50–80 | 15–50 | 5–15 | <5 | Exponential distribution |
| Average Net-to-Gross Ratio | - | ≥0.8 | 0.6–0.8 | 0.4–0.6 | 0.2–0.4 | <0.2 | Uniform distribution |
| Average Porosity | % | ≥30 | 25–30 | 15–25 | 10–15 | <10 | Industry Standard |
| Average Permeability | mD | ≥2000 | 500–2000 | 50–500 | 10–50 | <10 | Industry Standard |
| Reservoir Burial Depth | m | <1000 | 1000–1500 | 1500–3000 | 3000–3500 | ≥3500 | Normal distribution |
| Initial Reservoir Pressure | MPa | <10 | 10–20 | 20–30 | 30–40 | ≥40 | Normal distribution |
| Oil API Gravity | ° | ≥43 | 38–43 | 30–38 | 20–30 | <20 | Normal distribution |
| Reservoir Oil Viscosity | mPa·s | <1 | 1–5 | 5–20 | 20–50 | ≥50 | Industry Standard |
| Oil Volume Factor | - | ≥1.5 | 1.34–1.5 | 1.13–1.34 | 1.08–1.13 | <1.08 | Exponential distribution |
| Bubble Point Pressure | MPa | <5 | 5–10 | 10–20 | 20–30 | ≥30 | Exponential distribution |
| Original Oil in Place | MMbbl | ≥1600 | 700–1600 | 120–700 | 50–120 | <50 | Log-normal distribution |
| Reserves Abundance | MMbbl/km² | ≥32 | 16–32 | 4–16 | 2–4 | <2 | Exponential distribution |
| Well Pattern Density | km²/well | ≥1.2 | 0.6–1.2 | 0.25–0.6 | 0.1–0.25 | <0.1 | Exponential distribution |
| Initial Production per Well | bbl/d | ≥250 | 100–250 | 50–100 | 10–50 | <10 | Exponential distribution |
| Peak Production Rate | kb/d | ≥85 | 35–85 | 5–35 | 2–5 | <2 | Exponential distribution |
| Recovery Efficiency of Reserves | % | ≥80 | 60–80 | 40–60 | 20–40 | <20 | Exponential distribution |
| Oil Production Rate | % | ≥5 | 3.5–5 | 1.5–3.5 | 1–1.5 | <1 | Exponential distribution |
| Recovery Factor | % | ≥40 | 30–40 | 20–30 | 10–20 | <10 | Normal distribution |
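Applying the classification standards amounts to locating a value among four cutoffs. The sketch below shows one way to do this for two indicators from Table 15, one where higher values are Level 1 (Recovery Factor) and one where lower values are Level 1 (Reservoir Burial Depth). The function name `classify` and the treatment of values that fall exactly on a shared boundary (assigned to the upper interval) are assumptions, since the table does not say which level owns a boundary value.

```python
from bisect import bisect_right

def classify(value, cutoffs, higher_is_level_1=True):
    """Map a value to Level 1-5 given four ascending cutoffs.

    Assumption: a value equal to a cutoff falls in the upper interval,
    consistent with the ">=" and "<" bounds on the outer levels of Table 15.
    """
    band = bisect_right(cutoffs, value)  # 0..4 = number of cutoffs <= value
    return 5 - band if higher_is_level_1 else band + 1

RECOVERY_FACTOR = [10, 20, 30, 40]       # %, Level 1 is >= 40
BURIAL_DEPTH = [1000, 1500, 3000, 3500]  # m, Level 1 is < 1000

print(classify(35, RECOVERY_FACTOR))                           # Level 2 (30-40)
print(classify(5, RECOVERY_FACTOR))                            # Level 5 (< 10)
print(classify(800, BURIAL_DEPTH, higher_is_level_1=False))    # Level 1 (< 1000)
print(classify(3200, BURIAL_DEPTH, higher_is_level_1=False))   # Level 4 (3000-3500)
```

Encoding every quantitative indicator on the same five-level scale is what allows indicators with very different units and distributions to be compared in the analogy step.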
Table 16. Key indicator weights for static parameters.
| Weighting Method | Net Pay Thickness | Average Net-to-Gross Ratio | Average Porosity | Average Permeability | Reservoir Burial Depth | Initial Reservoir Pressure | Oil API Gravity | Reservoir Oil Viscosity | Oil Volume Factor | Bubble Point Pressure | Original Oil in Place | Reserves Abundance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Subjective Weighting Method | 0.11 | 0.04 | 0.12 | 0.13 | 0.05 | 0.06 | 0.12 | 0.08 | 0.05 | 0.02 | 0.08 | 0.14 |
| Entropy Method | 0.084 | 0.015 | 0.038 | 0.125 | 0.037 | 0.032 | 0.185 | 0.095 | 0.048 | 0.033 | 0.185 | 0.123 |
| Coefficient of Variation Method | 0.081 | 0.018 | 0.046 | 0.096 | 0.038 | 0.041 | 0.17 | 0.087 | 0.068 | 0.042 | 0.165 | 0.148 |
| Combined Subjective–Objective Method | 0.096 | 0.028 | 0.085 | 0.121 | 0.043 | 0.048 | 0.148 | 0.084 | 0.054 | 0.028 | 0.127 | 0.138 |
Table 17. Key indicator weights for development parameters.
| Weighting Method | Well Pattern Density | Initial Production per Well | Peak Production Rate | Recovery Efficiency of Reserves | Oil Production Rate | Recovery Factor |
| --- | --- | --- | --- | --- | --- | --- |
| Subjective Weighting Method | 0.16 | 0.18 | 0.15 | 0.16 | 0.15 | 0.2 |
| Entropy Method | 0.203 | 0.195 | 0.175 | 0.093 | 0.152 | 0.182 |
| Coefficient of Variation Method | 0.193 | 0.235 | 0.143 | 0.114 | 0.136 | 0.178 |
| Combined Subjective–Objective Method | 0.179 | 0.198 | 0.159 | 0.117 | 0.152 | 0.195 |
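For readers unfamiliar with the two objective methods named in the weight tables, the sketch below shows their textbook forms together with one possible way of blending objective and subjective weights. Everything here is illustrative: the sample data are hypothetical, the entropy variant skips any prior normalization step, and the linear blend with coefficient `alpha` is an assumption, because this section does not give the formula behind the published combined weights.

```python
import math
from statistics import mean, pstdev

def entropy_weights(columns):
    """Entropy weight method for positive-valued indicator columns."""
    n = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        shares = [x / total for x in col]
        # Entropy scaled to [0, 1]; shares equal to zero contribute nothing.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    return [d / s for d in divergences]

def cv_weights(columns):
    """Coefficient-of-variation weights: std / mean, normalized to sum to 1."""
    cvs = [pstdev(col) / mean(col) for col in columns]
    s = sum(cvs)
    return [c / s for c in cvs]

def combine(subjective, objective, alpha=0.5):
    """Linear blend of two weight vectors; alpha is a hypothetical coefficient."""
    mixed = [alpha * s + (1 - alpha) * o for s, o in zip(subjective, objective)]
    total = sum(mixed)
    return [m / total for m in mixed]

# Hypothetical data: two indicators observed on four oilfields.
columns = [
    [10.0, 12.0, 11.0, 13.0],   # nearly constant indicator
    [1.0, 50.0, 5.0, 200.0],    # widely dispersed indicator
]
w_entropy = entropy_weights(columns)
w_cv = cv_weights(columns)
w_combined = combine([0.4, 0.6], w_entropy)
print(w_entropy, w_cv, w_combined)
```

Both objective methods give more weight to the indicator with greater dispersion, which is why they are usually balanced against expert judgment rather than used alone.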
Table 18. Test set accuracy (%) of five machine-learning methods for recovery factor prediction in medium-high permeability sandstone oilfields.
| Method | SVM | RF | BP | KNN | DT |
| --- | --- | --- | --- | --- | --- |
| Test Set Accuracy (%) | 85.81 | 74.1 | 68.75 | 67.8 | 39.01 |
Table 19. Test set accuracy (%) of five machine-learning methods for recovery factor prediction in low-permeability sandstone oilfields.
| Method | SVM | RF | BP | KNN | DT |
| --- | --- | --- | --- | --- | --- |
| Test Set Accuracy (%) | 69.29 | 63.18 | 61.27 | 44.89 | 18.7 |
Table 20. Whole asset analogy results and test set accuracy (%) of five machine-learning methods.
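| Oilfield | Conventional Empirical Classification | SVM | RF | BP | KNN | DT |
| --- | --- | --- | --- | --- | --- | --- |
| M | 1 | 1 | 1 | 1 | 1 | 1 |
| WC | 1 | 1 | 2 | 1 | 1 | 1 |
| X | 1 | 1 | 1 | 3 | 1 | 5 |
| KR | 2 | 2 | 3 | 2 | 2 | 5 |
| SA | 2 | 2 | 2 | 2 | 2 | 5 |
| LP | 2 | 2 | 2 | 3 | 3 | 2 |
| D | 2 | 2 | 3 | 3 | 2 | 5 |
| D2 | 3 | 3 | 3 | 1 | 3 | 1 |
| BA | 3 | 3 | 3 | 3 | 4 | 5 |
| L | 3 | 3 | 3 | 3 | 4 | 5 |
| WS | 3 | 3 | 3 | 3 | 3 | 5 |
| QSF | 3 | 4 | 3 | 4 | 4 | 5 |
| BA2 | 3 | 3 | 3 | 3 | 4 | 5 |
| M2 | 4 | 4 | 3 | 3 | 4 | 5 |
| S | 4 | 4 | 3 | 3 | 3 | 5 |
| C | 4 | 4 | 3 | 2 | 2 | 5 |
| GPF | 4 | 4 | 4 | 4 | 4 | 5 |
| R | 4 | 4 | 1 | 2 | 1 | 1 |
| M3 | 5 | 5 | 5 | 2 | 5 | 5 |
| HP | 5 | 5 | 2 | 2 | 2 | 5 |
| Test Set Accuracy (%) |  | 95 | 70 | 55 | 55 | 25 |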
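The accuracy row is the share of oilfields whose predicted class matches the conventional empirical classification. As a minimal check, the sketch below recomputes that share for the SVM column from the class labels listed in Table 20; the helper name `accuracy` is illustrative.

```python
oilfields = ["M", "WC", "X", "KR", "SA", "LP", "D", "D2", "BA", "L",
             "WS", "QSF", "BA2", "M2", "S", "C", "GPF", "R", "M3", "HP"]

# Class labels copied from Table 20, in the same order as `oilfields`.
conventional = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5]
svm          = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 3, 4, 4, 4, 4, 4, 5, 5]

def accuracy(reference, predicted):
    """Percentage of samples whose predicted class equals the reference class."""
    hits = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * hits / len(reference)

svm_accuracy = accuracy(conventional, svm)
mismatches = [name for name, r, p in zip(oilfields, conventional, svm) if r != p]

print(f"SVM accuracy: {svm_accuracy:.0f}%")
print("Misclassified:", mismatches)
```

SVM differs from the conventional classification for a single oilfield (QSF, assigned class 4 instead of 3), giving 19 of 20 correct.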
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhang, M.; Lei, Z.; Yan, C.; Zeng, B.; Huang, F.; Qu, T.; Wang, B.; Fu, L. Construction of Analogy Indicator System and Machine-Learning-Based Optimization of Analogy Methods for Oilfield Development Projects. Energies 2025, 18, 4076. https://doi.org/10.3390/en18154076
