Thief Zone Assessment in Sandstone Reservoirs Based on Multi-Layer Weighted Principal Component Analysis

Abstract: Many factors influence the evaluation of thief zones, and the evaluation indexes carry complex information. Obtaining the effective information quickly is the key to improving the quality of thief zone evaluation. Because correlation and information redundancy among the evaluation indexes seriously affect the evaluation results, this paper proposes a multi-layer weighted principal component analysis (MLWPCA) method based on principal component analysis (PCA). First, factor analysis is performed on the original data to divide the evaluation indexes into several subsystems. Then, PCA is applied to each index subsystem to obtain its principal component score. Finally, each subsystem is weighted by the factor score, and the comprehensive thief zone score is obtained by combining the subsystem weights and the subsystem scores. A case study on the Daqing oilfield, verified by tracer tests, shows the effectiveness of the MLWPCA method for evaluating thief zones. The thief zones of the Daqing oilfield are clearly affected by effective thickness, the coefficient of permeability variation and interwell connectivity. At present, there are 10 well developed thief zones and eight moderately developed thief zones in the Daqing oilfield. The accuracy of this method is 94.44%. Compared with PCA, this method is more targeted in evaluating thief zones and more effective in determining the principal influencing factors.


Introduction
During long-term water flooding, geologic factors such as sand production and clay erosion, together with a large number of production factors such as injection pressure and high recovery rate, contribute greatly to the heterogeneity of formation structures, which may lead to the widespread formation of thief zones. A thief zone is defined as a laterally continuous stratigraphic unit of relatively high permeability and large pore radius that has approached residual oil saturation [1]. In reservoirs with thief zones, earlier water breakthrough results in uneven sweep of the reservoir, and the utilization efficiency of the injected water is seriously impaired, which leads to lower oil recovery and makes stimulation measures more difficult. Accordingly, the key to enhancing oil recovery in the high water cut stage is to identify thief zones effectively and to determine which wells should have their profile modified.
variables and contain information that does not overlap each other. Then the composite principal component score is calculated by using variance contribution rate as weight and the quantitative evaluation is performed. The main steps of principal-component-analysis are:

(a) To improve the accuracy of data analysis and eliminate the influence of data dimension, the original data of n samples are standardized using Equation (1):

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, 2, \cdots, n; \; j = 1, 2, \cdots, p \quad (1)$$

where $z_{ij}$ is the standardized evaluation index; $x_{ij}$ is the evaluation index; $\bar{x}_j$ and $s_j$ are the mean and standard deviation of the jth evaluation index.

(b) Calculating the correlation coefficient matrix R:

$$r_{ij} = \frac{\sum_{k=1}^{n} z_{ki} \cdot z_{kj}}{n - 1}, \quad i, j = 1, 2, \cdots, p \quad (2)$$

where R is the correlation matrix; Z is the standardized matrix of evaluation indexes.

(c) The eigenvalues $\lambda_j$ of the correlation coefficient matrix R are calculated, and the number of principal components m is determined according to the principle that the cumulative variance contribution rate is greater than 85%:

$$\frac{\sum_{j=1}^{m} \lambda_j}{\sum_{j=1}^{p} \lambda_j} > 85\% \quad (3)$$

(d) Calculating the principal component loading:

$$l_{ij} = \sqrt{\lambda_j}\, a_{ij}, \quad i = 1, 2, \cdots, p; \; j = 1, 2, \cdots, m$$

where $\lambda_j$ is the jth eigenvalue of the correlation coefficient matrix; $a_{ij}$ is the ith element of the corresponding orthonormal eigenvector of the correlation matrix.

(e) Calculating the principal component score $F_j$:

$$F_j = a_{1j} X_1 + a_{2j} X_2 + \cdots + a_{pj} X_p, \quad j = 1, 2, \cdots, m \quad (4)$$

where $F_j$ is the score of the jth principal component; $X_i$ is the ith standardized evaluation index.

(f) Calculating the composite score $Y_1$, with the variance contribution rate of each principal component as its weight:

$$Y_1 = \sum_{j=1}^{m} \frac{\lambda_j}{\sum_{k=1}^{p} \lambda_k} F_j \quad (5)$$

where $Y_1$ is the comprehensive score of the principal-component-analysis-method.
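Steps (a) to (c) can be illustrated with a minimal Python sketch. It assumes the usual z-score standardization with the sample (n − 1) standard deviation; the function names, the toy table and the eigenvalues are hypothetical and are not taken from the Daqing data.

```python
import math


def standardize(data):
    """Z-score every column of a list-of-rows table (sample standard deviation, n - 1)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
        for j in range(p)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in data]


def correlation_matrix(z):
    """r_ij = sum_k z_ki * z_kj / (n - 1) for already standardized data z."""
    n, p = len(z), len(z[0])
    return [
        [sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def components_needed(eigenvalues, threshold=0.85):
    """Smallest m whose cumulative variance contribution rate reaches the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for m, lam in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += lam / total
        if cumulative >= threshold:
            return m
    return len(eigenvalues)


# Hypothetical toy table: 4 samples (rows) x 2 indexes (columns).
toy = [[1.0, 10.0], [2.0, 14.0], [3.0, 15.0], [4.0, 21.0]]
R = correlation_matrix(standardize(toy))
m = components_needed([2.5, 1.0, 0.3, 0.2])  # hypothetical eigenvalues
print(R, m)
```

Because the data are standardized first, the diagonal of R is 1 and the off-diagonal entries are the ordinary Pearson correlations, which is why the eigenvalues of R sum to the number of indexes.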

Multi-Layer Weighted Principal-Component-Analysis
Because the PCA method discriminates poorly when analyzing multi-level and multi-angle evaluation systems, this paper constructs the multi-layer weighted principal component analysis (MLWPCA) method. Its core idea is to divide the indexes into subsystems based on factor analysis, perform PCA on each index subsystem, and weight each index subsystem by the factor score. Finally, the PCA results of the index subsystems are synthesized to obtain a comprehensive score Y2. Compared with the PCA method, the MLWPCA method divides the total system into several subsystems, each with fewer indicators, without changing the number of evaluation sample points. According to the law of large numbers, the larger the number of evaluation samples relative to the number of indicators, the more stable the covariance matrix and the higher the evaluation accuracy. This greatly improves the stability and credibility of the thief zone evaluation results. Moreover, unlike the PCA method, which uses a covariance matrix to describe the correlation between the indicators, the MLWPCA method uses the rotated factor load matrix obtained by the maximum orthogonal rotation method. In this way, the variance of the more important indicators in the evaluation system is stretched and receives more attention in the evaluation, which makes the classification of thief zones more explicit. The main steps of the multi-layer weighted principal-component-analysis are:

(a) Carrying out factor analysis on the standardized matrix Z, selecting m principal factors according to the principle that the cumulative variance contribution rate is more than 75 percent, and dividing the system into m subsystems, where each subsystem comprises p indexes.

(b) Calculating the weight of each index subsystem from the factor score coefficients and the variance contribution rate, where $\omega_i$ is the subsystem weight; j is the number of indicators; $\beta_{ij}$ is each factor score coefficient; $e_i$ is the variance contribution rate.

(c) Principal-component-analysis is carried out on each index subsystem, and the comprehensive score Y2 is obtained by weighting the subsystem scores with the corresponding weights:

$$Y_2 = \sum_{i=1}^{m} \omega_i Y_{2i}$$

where $Y_2$ is the comprehensive score of the multi-layer weighted principal-component-analysis-method; $Y_{2i}$ is the composite score of subsystem i.
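The grouping in step (a) and the weighted combination in step (c) can be sketched as follows. This is only an illustration under stated assumptions: each index is assigned to the factor on which its absolute rotated loading is largest (the source only says the load value is "higher"), the loadings and subsystem scores are hypothetical, and the weights 0.777 and 0.223 are the ones reported later for the Daqing case.

```python
def group_by_dominant_factor(loadings):
    """Assign each index to the factor with the largest absolute rotated loading.

    loadings maps an index name to its list of loadings, one per factor.
    Returns a dict: factor position -> list of index names (one subsystem each).
    """
    groups = {}
    for name, row in loadings.items():
        factor = max(range(len(row)), key=lambda f: abs(row[f]))
        groups.setdefault(factor, []).append(name)
    return groups


def comprehensive_score(weights, subsystem_scores):
    """Y2 = sum_i w_i * Y2i for one sample."""
    if len(weights) != len(subsystem_scores):
        raise ValueError("one weight is needed per subsystem score")
    return sum(w * y for w, y in zip(weights, subsystem_scores))


# Hypothetical rotated loadings of four indexes on two factors.
loadings = {
    "a": [0.91, 0.20],
    "b": [0.15, 0.88],
    "c": [-0.79, 0.30],
    "d": [0.40, 0.65],
}
groups = group_by_dominant_factor(loadings)

# Reported subsystem weights with hypothetical subsystem scores for one well pair.
y2 = comprehensive_score([0.777, 0.223], [1.2, -0.4])
print(groups, round(y2, 4))
```

Splitting the indexes this way keeps the number of samples fixed while each subsystem PCA works on fewer indexes, which is the stability argument made above.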

Example
Taking the Daqing oilfield as the evaluation target, the related thief zone evaluation indexes are selected based on the reservoir geological data and production monitoring results. The MLWPCA method is then used to evaluate the thief zones, and the evaluation results are verified by tracer tests.

Overview of Research Blocks
The Daqing oilfield consists of heterogeneous sandstone reservoirs with positive rhythmic deposition. The burial depth of the reservoir is 780~1300 m. The average effective thickness is 43.5 m. The average porosity is 18%. The original oil saturation is 52~61%. The original formation pressure of the reservoir is 11.07 MPa. The difference of ground saturation pressure is 8.23 MPa. The temperature of the oil layer is 42.7~51 °C. The density of underground crude oil is 0.89 g/cm³. There are 11 water injection wells and 36 production wells.
Since 1978, the Daqing oilfield has gone through three development stages. Depletion development was used in the early stage, during which the formation pressure dropped rapidly with no stable period. The second stage started with water injection. As the volume of injected water increased, the liquid production of the oilfield increased and the decline rate slowed down. The third stage is the full water injection stage, in which both oil production and water content increased greatly. At present, the Daqing oilfield has entered the high water cut stage. The water cut rose sharply (the comprehensive water cut in 2016 was 71%) and the recovery rate was low (the geological reserves recovery rate in 2016 was 7.23%). The maximum daily water injection for a single well is about 1000 m³/d. The average oil pressure of the injection wells is 9.8 MPa. The comprehensive water content is 89.8%.
From the geological point of view, the oil reservoir is thick, is clearly affected by the gravitational differentiation of oil and water, and is highly heterogeneous. These characteristics provide a geological basis for the development of thief zones. From the production point of view, the oilfield has high water content, with a cumulative annual growth rate of water production of up to 15%, and the daily oil production did not increase significantly after the water injection capacity was enhanced. The analysis shows that large-scale thief zones have appeared in the reservoir: the reservoir heterogeneity is aggravated, the injected water circulates ineffectively, and the water drive sweep volume is greatly reduced, which will seriously reduce the final recovery. Therefore, analyzing and identifying the thief zones is of great significance for improving the development effect of the oilfield. The development status of the Daqing oilfield in 2016 is shown in Table 1.

Evaluation Index Selection
In sandstone reservoirs, the relatively large porosity, permeability and effective thickness make reservoir heterogeneity prominent. Gravity differentiation between oil and water strongly affects the reservoir, so thief zones form relatively easily. After a thief zone is formed, the resistance to fluid flow through the reservoir decreases and the underground conductivity is enhanced, which may cause a decrease in the pressure difference between injection and production wells, a rapid increase in water cut, and a significant increase in the liquid productivity index. If the gray correlation degree is used to characterize interwell connectivity, the connectivity between injection and production wells is significantly enhanced after the formation of the thief zone. Based on the basic theory of reservoir engineering and the characteristics of thief zones, nine evaluation indexes are selected according to systematic, scientific and representative principles, as shown in Table 2.

Table 2. Thief zone evaluation indexes (surviving rows).

No.    Index                                       Unit
       Permeability variation coefficient          %
x5     Interwell connectivity                      1
       Injection-production pressure difference    MPa
x9     Liquid productivity index                   10³ m³/d·MPa

The connectivity between injection and production wells can be calculated by Formulas (8)~(11). The time series of water injection for an injection well is:

$$X_0 = \{x_0(t) \mid t = 1, 2, \cdots, n\}$$

where $X_0$ is the time series of daily water injection of the injection well, m³/d; $x_0$ is the volume of daily water injection at different times, m³/d; t is the water injection time, d; n is the total number of water injection days, d.

The oil production time series of a surrounding connected production well is:

$$X_i = \{x_i(t) \mid t = 1, 2, \cdots, n\}$$

where $X_i$ is the time series of daily oil production of the production well, m³/d; $x_i$ is the volume of daily oil production at different water injection times, m³/d; t is the water injection time, d; n is the total number of water injection days, d.

The correlation coefficient of sequences $X_i$ and $X_0$ is calculated first, and the correlation degree is then defined from it, where $r_i$ is the correlation degree between subsequence i and sequence 0, and n is the sequence length. The data required to calculate the connectivity of water injection and oil production are obtained by actual measurements onsite.
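The correlation coefficient and correlation degree formulas are not recoverable from this copy, so the sketch below uses the commonly cited grey relational analysis form as a stand-in, not the authors' exact equations. It assumes mean normalization of each series, a distinguishing coefficient rho = 0.5, strictly positive rates and equal-length series; the function name, well names and data are hypothetical.

```python
def grey_relational_degree(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison series against the reference.

    reference: injection rate series X0; comparisons: name -> production series Xi.
    Each series is divided by its own mean first so that units do not matter.
    """
    def mean_normalize(series):
        mean = sum(series) / len(series)
        return [value / mean for value in series]

    ref = mean_normalize(reference)
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, mean_normalize(series))]
        for name, series in comparisons.items()
    }
    all_deltas = [d for series in deltas.values() for d in series]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every comparison series has exactly the same shape as the reference.
        return {name: 1.0 for name in deltas}
    return {
        # Degree = average of the point-wise relational coefficients.
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in series) / len(series)
        for name, series in deltas.items()
    }


# Hypothetical daily rates (m3/d): P1 follows the injector, P2 moves against it.
injection = [100.0, 200.0, 300.0]
production = {"P1": [10.0, 20.0, 30.0], "P2": [30.0, 20.0, 10.0]}
degrees = grey_relational_degree(injection, production)
print(degrees)
```

A producer whose rate history tracks the injector's gets a degree close to 1, which is the sense in which a higher correlation degree is read as stronger interwell connectivity.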
The coefficient of permeability variation is calculated by Equation (12):

$$V_K = \frac{1}{\bar{K}} \sqrt{\frac{\sum_{i=1}^{n} (K_i - \bar{K})^2}{n}} \times 100\% \quad (12)$$

where $V_K$ is the coefficient of permeability variation, %; $K_i$ is the permeability of sample i, µm²; $\bar{K}$ is the average permeability of all samples, µm²; n is the number of samples. The permeability data are obtained by well logging.
The apparent injectivity index and the liquid productivity index are calculated by Equations (13) and (14), respectively:

$$A_K = \frac{Q_w}{P_w} \quad (13)$$

$$A_O = \frac{Q_L}{P_O} \quad (14)$$

where $A_K$ is the apparent injectivity index, m³/(d·MPa); $A_O$ is the liquid productivity index, m³/(d·MPa); $Q_w$ is the volume of daily water injection of the injection well, m³/d; $Q_L$ is the volume of daily oil production of the production well, m³/d; $P_w$ is the wellhead pressure of the injection well, MPa; $P_O$ is the wellhead pressure of the production well, MPa. The daily water injection, the daily oil production and the wellhead pressure data involved in the calculation are obtained from actual on-site measurements.
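As a small worked illustration of these index definitions, the sketch below assumes that the coefficient of permeability variation is the population standard deviation divided by the mean (the ratio described later in the discussion of the correlation matrix) and that both well indexes are simple rate-over-wellhead-pressure ratios. The function names and the sample values are hypothetical.

```python
import statistics


def permeability_variation(permeabilities):
    """Coefficient of permeability variation in percent (population std / mean)."""
    mean_k = statistics.mean(permeabilities)
    return statistics.pstdev(permeabilities) / mean_k * 100.0


def rate_per_pressure(rate_m3_per_d, wellhead_pressure_mpa):
    """Apparent injectivity or liquid productivity index, in m3/(d*MPa)."""
    if wellhead_pressure_mpa <= 0:
        raise ValueError("wellhead pressure must be positive")
    return rate_m3_per_d / wellhead_pressure_mpa


# Hypothetical logged permeabilities (um^2) and well measurements.
v_k = permeability_variation([1.0, 2.0, 3.0, 4.0, 5.0])
a_k = rate_per_pressure(120.0, 8.0)   # injection well
a_o = rate_per_pressure(45.0, 1.5)    # production well
print(round(v_k, 2), a_k, a_o)
```

Whether the standard deviation uses n or n − 1 changes the value slightly for small samples, so the choice should match the one used when the indexes were originally compiled.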

Principal-Component-Analysis-Method
The standardized original data of the evaluation indexes for the Daqing oilfield are shown in Figure 1, and principal-component-analysis is then performed on them. According to the principle that the cumulative contribution rate of eigenvalues is greater than 85%, three principal components are selected (eigenvalue > 1). The eigenvalues of the three principal components are 5.499, 1.433 and 1.058, respectively (the scree plot is shown in Figure 2). The contribution rates of the eigenvalues are 61.104%, 15.919% and 11.754%, respectively, and the cumulative contribution rate is 88.777%. The correlation coefficient matrix analyzed by the principal-component-analysis is shown in Table 3.

The correlation coefficient matrix (Table 3) shows that the injection-production pressure difference is negatively correlated with the other eight evaluation indexes. This is because, once a thief zone exists, the flow resistance of the fluid through the formation decreases, which lowers the injection-production pressure difference. At the same time, the evident negative correlation between the injection-production pressure difference and interwell connectivity indicates that pressure plays a significant role in driving fluid flow to a local area. Among the remaining eight evaluation indexes, permeability and the permeability variation coefficient are clearly positively correlated, because the permeability variation coefficient is the ratio of the permeability standard deviation to the average permeability. The physical meaning of the two indexes is nevertheless different: permeability is mainly used to evaluate the development of the thief zone near the well, while the permeability variation coefficient is used to evaluate the heterogeneity of the entire formation.

Calculating the communalities of the three principal components shows that they retain at least 85% of the information in the original data, except for the liquid productivity index and the water content. This indicates that principal-component-analysis (PCA) is effective for dimension reduction and for simplifying the original complex multi-dimensional evaluation system. The communalities are shown in Table 4. The load values of the three principal components are calculated (the load diagram is shown in Figure 3), and the functional expression of each principal component is obtained from the load values.

According to the scores of the principal components, the comprehensive evaluation scores Y1 (shown in Table 4) of the thief zones between injection and production wells in the Daqing oilfield are obtained, ranging from −2.600 to 4.704. The higher the comprehensive score, the more developed the thief zone. When the comprehensive score is negative, the seepage channel between injection and production wells is in good condition and the water drive power is sufficient. When the comprehensive score is greater than 0.0 but less than 1.0, the thief zone is moderately developed. At this stage, close observation and appropriate measures are needed to avoid further development into large pore throats. When the comprehensive score is greater than 1.0, the thief zone is well developed and water channeling has occurred: a large amount of injected water is circulating inefficiently between injection and production wells. According to Figure 4, the Daqing oilfield has 10 well developed thief zones, accounting for 19.61% of the total number of thief zones, and eight moderately developed thief zones, accounting for 15.69% of the total. Improvements such as profile control and water plugging are needed for the injection-production wells with thief zones to avoid further development, which may influence the final recovery ratio of the oilfield.
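The reported contribution rates and the score thresholds can be checked with a short sketch. It assumes that the total variance equals the number of standardized indexes (nine), which holds for a correlation matrix, and it assigns the boundary scores 0.0 and 1.0 to the higher class, a choice the text does not specify. The function names are hypothetical.

```python
def contribution_rates(eigenvalues, n_indexes):
    """Variance contribution rate of each component of a correlation-matrix PCA."""
    return [lam / n_indexes for lam in eigenvalues]


def classify(score):
    """Map a comprehensive score to the development classes used in the text."""
    if score < 0.0:
        return "not developed"
    if score < 1.0:
        return "moderately developed"
    return "well developed"


rates = contribution_rates([5.499, 1.433, 1.058], n_indexes=9)
cumulative = sum(rates)
print([round(100 * r, 2) for r in rates], round(100 * cumulative, 2))

# The two ends of the reported score range plus one value in the middle class.
labels = [classify(s) for s in (-2.600, 0.5, 4.704)]
print(labels)
```

The first two components alone already exceed 75% of the variance, while the third is needed to pass 85%, which is consistent with three components being kept here and two factors being kept in the factor analysis step.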
Energies 2018, 11, x FOR PEER REVIEW 9 of 13

Multi-Layer Weighted Principal-Component-Analysis-Method
The PCA method cannot accurately distinguish the degree of development among thief zones of the same class. To improve the accuracy of the evaluation results obtained with PCA, the MLWPCA method was constructed. First, the nine indexes are classified according to the result of factor analysis. According to the principle that the cumulative contribution rate is greater than 75%, two factors are selected, with contribution rates of 61.104% and 15.919%, respectively. The rotated load matrix is shown in Table 5. The score coefficient matrix obtained by the maximum orthogonal rotation method shows that the evaluation indexes x1, x3, x4, x7 and x9 have higher load values on factor 1, so they are grouped into index subsystem 1. The indexes x2, x5, x6 and x8 have higher load values on factor 2, so they are grouped into index subsystem 2. Using Equation (8) for weight calculation, the weights of subsystems 1 and 2 are 0.777 and 0.223, respectively.

Table 5 (surviving row). Liquid productivity index: 0.703, 0.566, 0.134, 0.068

PCA was carried out on the evaluation indexes of the two thief zone subsystems. The contribution rates of subsystem 1 and subsystem 2 were 90.302% and 86.157%, respectively. The synthesis scores Y21 and Y22 (Y22' and Y22'') are expressed in the following equations. Based on the scores Y21 and Y22, the synthesis score Y2 of the thief zones is calculated, as shown in Figure 5. The thief zones of the Daqing oilfield are evaluated by the comprehensive score Y2.

The comprehensive score Y2 shows that there are 10 well developed thief zones and eight moderately developed thief zones in the Daqing oilfield, and that thief zones have not formed in the remaining 34. By comparing and analyzing the original basic data, we found that the effective thickness and the coefficient of permeability variation are the main indexes affecting the development of thief zones in subsystem 1. Such a thief zone has a large effective thickness, and its formation is clearly affected by oil-water gravity differentiation. The reservoir is highly heterogeneous, which provides the geological basis for developing thief zones. Subsystem 2 developed a more advanced thief zone than subsystem 1. Interwell connectivity is the main indicator affecting the development of dominant channels in subsystem 2. Such a thief zone has a large throat radius, strong fluid diversion between wells, and a very fast rate of further deterioration. Therefore, the corresponding measures should be taken as soon as possible to control it. It can be concluded from the above analysis that the MLWPCA method is more targeted and differentiated than the traditional PCA method. It is an effective evaluation method for thief zone determination and is worth wider application.
A comparison with the thief zone identification results obtained using the PCA method in Section 3.3 shows that the two methods give consistent identification results. The thief zones J1, J3, J5, J7, J9, J11, J13, J15, J23 and J31 are seriously developed, while the thief zones J8, J12, J16, J17, J19, J21, J25 and J27 are moderately developed. Comparing the composite scores of the thief zones between the two methods (as shown in Table 6), we found that the comprehensive scores obtained using the MLWPCA method differ more clearly from one thief zone to another than those obtained using the PCA method. This highlights the differences in grade among thief zones and makes the evaluation results more specific and differentiated.
In the process of thief zone evaluation, an interwell tracer test was used to verify the accuracy of the MLWPCA method. As shown by the test results (Table 6), tracer was detected in the 10 thief zones classified as well developed by the MLWPCA method, with an average tracer breakthrough time of 6.4 months. Tracer was detected in seven of the eight moderately developed thief zones, with an average breakthrough time of 10.6 months. Tracer was not detected for the J8 interwell, which may be because the relatively small amount of injected tracer did not reach the oil well. According to the tracer test, the accuracy rate of evaluating the thief zones using the MLWPCA method is 94.44%. This shows that the method is effective in identifying thief zones.
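The reported accuracy can be reproduced from the zone lists given above. The sketch assumes that accuracy means the share of identified thief zones (well and moderately developed) in which tracer was actually detected; the variable names are hypothetical.

```python
# Zones as classified by the MLWPCA method in the text.
well_developed = ["J1", "J3", "J5", "J7", "J9", "J11", "J13", "J15", "J23", "J31"]
moderately_developed = ["J8", "J12", "J16", "J17", "J19", "J21", "J25", "J27"]

# Interwell pairs where the tracer test did not detect breakthrough.
no_tracer_detected = {"J8"}

identified = well_developed + moderately_developed
confirmed = [zone for zone in identified if zone not in no_tracer_detected]

# Share of identified thief zones confirmed by the tracer test (17 of 18).
accuracy = len(confirmed) / len(identified)
print(f"{accuracy:.2%}")  # prints 94.44%
```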