Physical Scaling of Oil Production Rates and Ultimate Recovery from All Horizontal Wells in the Bakken Shale

A recent study by the Wall Street Journal reveals that the hydrofractured horizontal wells in shales have been producing less than the industrial forecasts with the empirical hyperbolic decline curve analysis (DCA). As an alternative to DCA, we introduce a simple, fast and accurate method of estimating ultimate recovery in oil shales. We adopt a physics-based scaling approach to analyze oil rates and ultimate recovery from 14,888 active horizontal oil wells in the Bakken shale. To predict the Estimated Ultimate Recovery (EUR), we collapse production records from individual horizontal shale oil wells onto two segments of a master curve: (1) We find that cumulative oil production from 4845 wells is still growing linearly with the square root of time; and (2) 6,401 wells are already in exponential decline after approximately seven years on production. In addition, 2363 wells have discontinuous production records, because of refracturing or changes in downhole flowing pressure, and are matched with a linear combination of scaling curves superposed in time. The remaining 1279 new wells with less than 12 months on production have too few production records to allow for robust matches. These wells are scaled with the slopes of other comparable wells in the square-root-of-time flow regime. In the end, we predict that total ultimate recovery from all existing horizontal wells in Bakken will be some 4.5 billion barrels of oil. We also find that wells completed in the Middle Bakken formation, in general, produce more oil than those completed in the Upper Three Forks formation. The newly completed longer wells with larger hydrofractures have higher initial production rates, but they decline faster and have EURs similar to the cheaper old wells. There is little correlation among EUR, lateral length, and the number and size of hydrofractures. Therefore, technology may not help much in boosting production of new wells completed in the poor immature areas along the edges of the Williston Basin. Operators and policymakers may use our findings to optimize the possible futures of the Bakken shale and other plays. More importantly, the petroleum industry may adopt our physics-based method as an alternative to the overly optimistic hyperbolic DCA that yields an ‘illusory picture’ of shale oil resources.


Introduction
In early 2019, seven major shale plays in the US produced nearly 70% of total US oil production. The Bakken shale is the second largest oil-producing play that in mid-2019 produced 1.45 million bbl of oil per day from 14,888 active horizontal hydrofractured wells. Therefore, it is important to have a reliable predictive model of oil production and ultimate recovery in the Bakken and other shale plays. In shale or tight oil reservoirs, where drainage area and communication between wells are limited [1,2], the estimated ultimate recovery (EUR) per well is the key indicator used to assess fieldwide production potential. The most popular way of predicting EUR from a shale well is the traditional decline curve analysis (DCA) proposed by Arps [3] 75 years ago. This method is favored by the industry, because it only requires production data to forecast future performance of each well and it is easy to implement for many wells. Most published fieldwide predictions of EUR use DCA. Good examples are the Bakken predictions by the US Geological Survey [4,5] and US DOE Energy Information Administration [6].
In his seminal paper, Arps [3] stated that DCA works by extrapolating oil rate observed in a pseudo-steady state (boundary-dominated) flow regime. Unfortunately, because of the vanishing matrix permeability, shale wells do not reach pseudo-state flow regime during their entire existence. Thus, traditional DCA commonly overestimates EURs of shale wells [7,8]. In early 2019, an analysis published by the Wall Street Journal (WSJ) revealed that 2/3 of the projections made by shale companies appeared to be overly optimistic [9,10]. Thousands of shale wells drilled between 2014 and 2017 are pumping on average 10% less oil and gas than their owners' forecasts. In certain regions, the discrepancy between forecasts and actual production is more than 50%. The total deficit is equivalent to one billion barrels of oil and gas that will not be produced over 30 years or to $30 billion at current oil price.
Several authors have suggested alternatives to the traditional DCA. These are, e.g., the Power Law Exponential Decline [11], the Stretched Exponential Decline [12,13], the Logistic Growth Model [14], the Bayesian probabilistic DCA [15,16], and the Extended Exponential DCA [17]. Empirical models are simple, but they also are mere curve fits, whose parameters have little physical meaning and cannot be generalized. In hyperbolic DCA, for example, the exponent b is adjusted to fit production data, notwithstanding several studies that have tried to elucidate its physical meaning [18][19][20]. Another limitation of most empirical models is their inability to capture the two regimes of oil and gas production in shales that are transient-linear flow at early times and boundary-dominated flow at later times on production [21]. We posit that the physics-based analytic or numeric solutions (such as [22,23]) are more predictive than the empirical decline curves. Unfortunately, many analytical solutions proposed in recent years are unpopular because of their complexity and large quantities of input data.
In this paper, we reintroduce our physics-based scaling approach that is based on an analytical solution of transient oil flow. In several other papers, notably in [14,21,[24][25][26][27], we have reviewed the vast literature on this subject and, to avoid repetition, we omit it here. The physical scaling approach was originally developed to predict gas production in the Barnett shale [24,25]. In 2017, we extended our earlier gas production scaling to predict EURs from thousands of oil wells in the Eagle Ford [28]. In this paper, in turn, we extend the Eagle Ford scaling to predict EURs from the 14,888 existing horizontal, hydrofractured black oil wells in the Bakken. We consider the simplest model of oil production consistent with the first order physics and idealized geometry of the extraction process. In principle, the solution of this model depends on many parameters, but in practice-for a given shale play-all but two can be fixed at a field region-average values, resulting in an exact solution of a linearized transient pressure diffusion problem with simplified geometry of the hydrofractured horizontal wells as the boundary condition. At early times, the scaled oil production rate declines as one over the square root of dimensionless time on production, reflecting transient oil flow. Later, oil production rate declines exponentially indicating boundary-dominated flow. In addition, we combine the physical scaling approach with time superposition to resolve erratic production records of the refractured old Bakken wells, and of the wells that were put on artificial lift with time-varying downhole conditions.
To predict EURs, we collapse production records from individual horizontal oil wells onto two different segments of an almost universal scaling curve. The scaling is easy to program and works perfectly for large production data sets. Even though our approach is mechanistic, it preserves the simplicity of traditional DCA by relying only on average reservoir rock and fluid properties and on well-by-well time series of oil production. In a slightly different context, our physical scaling approach has been proven to have the same predictive power as full reservoir simulation, while being several orders of magnitude faster [29]. The EURs calculated from the physical scaling are compared with the latest results published by EIA.
We map water cuts, gas-oil ratios, and calculated EURs of Bakken wells to locate the best producing region in the Williston basin. We also analyze how recent advancements of well completion technologies influence EUR. Operators and policymakers may use our findings to guide sensible development of the Bakken and other shale plays. More importantly, petroleum industry may adopt our simple, fast and accurate method to predict oil production in shales as an alternative to the overly optimistic hyperbolic DCA that yields an 'illusory picture' of shale oil resources [9].

Physical Scaling
The physical scaling approach extended in this paper was first derived in the paper SPE-187226 [28], and used to predict oil production from thousands of horizontal wells in the Eagle Ford shale play. The scaling solution given in SPE-187226 is based on an assumption that oil is produced from a hydrofractured shale well under a natural depletion mechanism (Section 2.1.1). If a well intervention, most notably refracturing of old wells, occurs, this well's production cannot be scaled with a single master curve. Therefore, we implement superposition in time to address the erratic production records caused by well refracturing (Section 2.1.2), or putting a well on artificial lift, or changing downhole flowing pressure, or a combination of two or more of these factors. Figure 1a illustrates the simple conceptual model used to derive our scaling method. The fracture height, H, is assumed to be the formation thickness; 2L is the tip-to-tip hydrofracture length; and 2d is the distance between two consecutive hydraulic fractures (or stages). M is the mass of initial oil in place inside of the stimulated reservoir volume (SRV). In this model, radial flow into the wellbore is neglected, because the hydrofracture permeability is much higher than that of the rock matrix. As a result, we can assume bi-linear flow towards hydraulic fracture planes and write a one-dimensional nonlinear pressure diffusion equation inside SRV. The oil production rate

Physical Scaling of Natural Depletion
is obtained from the solution of the nonlinear-in general-oil pressure diffusion equation Here α is the pressure diffusivity coefficient that depends on the reservoir fluid pressure; ρ fluid is the bulk density or reservoir fluid that may contain free gas in the near-hydrofracture regions; and c t is the total system compressibility near bubble point pressure. For the derivation details, please see [28] The initial and boundary conditions for this problem are p(x, t = 0) = p i , Initial pressure is uniform p(x ± 1, t > 0) = p f , Hydrofracture pressure is constant during production (4) Briefly, with several assumptions that reflect the simplified physics of oil flowing mostly above the bubble point, the initial and boundary value problem (IBVP) (Equations (2)-(4)) was solved analytically [28], and the following nonlinear least square optimization problem was obtained min M>0,τ>0 This problem has two unknowns, and where M is the initial mass of oil in place inside the stimulated reservoir volume; 2dN is the total length of horizontal well; and τ is the characteristic time of pressure interference, or the square of interfracture half-distance, d, divided by the hydraulic diffusivity, α i , at initial reservoir conditions. The parameter τ is the characteristic time needed for the depletion of oil pressure to diffuse to the midplane between two consecutive hydrofractures.

Recovery Factor
Dimensionless time Reproduced from [28]. (b) Illustration of the (almost) universal scaling curve method. The black line is the master curve that is the solution of the pressure diffusivity equation in hydrofractured well geometry. This master curve scales initially as the square root of time that later slows down due to exponentially declining rate of production. The constant C governs the vertical stretching of the master curve and can be calculated as C = c ti /S oi (P i − P f ). The cumulative mass produced by individual wells is then adjusted to match the master curve by a stretching/shrinking factor of M along the y-axis and τ along the x-axis. Reproduced from [29].
The complex, highly nonlinear optimization problem in Equation (5) is difficult to solve numerically. Thus, a simpler but equivalent "physical scaling curve method" was introduced and illustrated in Figure 1 (right). This method consists of two steps. First, a master curve is constructed using Equation (8) with dimensionless time,t = t τ , as the x-axis and dimensionless recovery factor, RF, as the y-axis. Second, for each well, its recorded cumulative produced mass is matched to the master curve by scaling the x-axis with τ and y-axis with M. Please note that these two steps are equivalent to solving the nonlinear optimization problem in Equation (5). The left hand side of Equation (5), , is the monthly mass production, N p ; and the right hand side is the recovery factor in Equation (8) multiplied by the accessible mass in place, M (remember thatt = t τ ).
being the ultimate oil recovery. To make our scaling method even more 'engineer-friendly', we provide the best closed-form approximation of the exact solution in Equation (8): For the Bakken, the optimal values of constants λ and β are 1.3 and 0.55, respectively. The universal mathematical constant λ ∼ 1.3 was obtained by Conway in his lost Cosmological Theorem, which asserts that after a certain number of steps from the Big Bang-the beginning of a Conway sequence of integers [30,31]-all the exotic elements, all the things that are not compounds of common atoms, of digit 1 (hydrogen) for example, disappear and only common atoms are left, also see [32] pp. 452-455. The power exponent, β, was obtained from a fit. Equation (10) is much easier to calculate than the summation of the infinite series in Equation (8), and may be computed with Excel R . Now, the linearized oil pressure diffusion equation can be solved painlessly, well-by-well, and the resulting set of fit parameters, {τ, M}, provides key insights into the physics of production of the mostly saturated shale oil from thousands of wells. Finally, each EUR is calculated as the maximum recovery factor value from the master curve, RF(t max ), multiplied by the mass of oil in place, M.

Physical Scaling of Well Refracturing
In many cases, the first hydraulic fracturing job is unsuccessful and results in a poor quality well that is a candidate for refracturing at a later time. Refracturing contributes to sudden jumps in production rate. The refracturing process is illustrated as an increase of the SRV in Figure 2a. As before, M is the initial mass of oil in place in the SRV. Before refracturing, the SRV box contains M 0 ktons of oil in place. At time t 1 , the first refracturing job is completed, and the mass in place jumps to M 1 . At t 2 , the second refracturing job is performed, and an even bigger M 2 is accessed. The refracturing process increases the mass M used to rescale cumulative oil production. By superposition theorem, this oil production is a summation of the production before refracturing that accessed the original mass in place, M 0 , and secondary production from imaginary wells at times t j , j = 1, 2, . . . that access the incremental masses in place, (M j − M j−1 ).

Recovery Factor
Scaled production data Total master curve (b) A modification of the physical scaling curve coupled with superposition solves well refracturing problem. If primary production before refracturing has the recovery constant C (cf. Figure 1), then each refracturing event at t j will start an imaginary well with the recovery constants Ca j . Here a j scales the effectiveness of each fracturing job relative to the initial mass in place (a j = (M j − M j−1 )/M 0 ). Finally, the total master curve is scaled as C T = C(1 + a 1 + ... + a n ) to match the discontinuous historical production.
The production jumps from refracturing cannot be scaled with the standard master curve shown in Figure 1. Thus, we combine the physical scaling solution with superposition and obtain Equation (12) that remedies this difficulty. If primary production before refracturing has the recovery constant C in Equation (13), then each refracturing event j at time t j activates an imaginary well with the recovery constant Ca j . Here a j 's are the new fitting parameters that scale effectiveness of each refracturing job relative to the initial mass in place, Equation (14). H j is the Heaviside step function that handles time delays in the consecutive refracturing jobs.
The total master curve RF T is the summation of all individual master curves, RF j , see Equations (16) and (17). An example of matching the scaled discontinuous historical production is shown in Figure 2b.
Finally, one calculates EUR by multiplying the maximum total recovery factor from the modified master curve, RF T (t max ), with the initial mass of oil in place before fracturing, M 0 , Equation (18).

Data Collection and Scaling Procedure
We have mined public well records for the Bakken shale from the DrillingInfo (now Enverus) database, the Montana Department of Natural Resources and Conservation-Board of Oil and Gas Conservation website, and the North Dakota Department of Mineral Resources-Oil and Gas Division website with a premium subscription. In DrillingInfo, we have selected only 14,888 horizontal oil wells that were completed after 1 January 2000, in the Middle Bakken and Three Forks formations. The raw production data from DrillingInfo were imported into an integrated MATLAB R software package we designed to perform data cleanup, consolidation, and unit conversions. The clean production data for each well consist of a vector of elapsed time on production (consecutive months with nonzero oil volume records) and a vector of cumulative oil production in kilotons.
Our scaling procedure of obtaining EUR for each well is as follows: 1. Exclude all newly completed wells with less than 12 months of production.

2.
For each remaining well, plot its cumulative production vs. square root of time on production. Classify these wells as: (a) Non-interfering wells, if the plot shows a straight line (with, e.g., Interfering wells, if the plot shows a deviation from a straight line (with, e.g., Refrac wells, if production jumps are detected.

3.
For each non-interfering well, scale its cumulative production by K √ t max , on the y−axis and the elapsed time on production by t max on the x−axis to match the line f (t) =

√t
. To predict EUR, assume that deviation from the line starts at τ = 2t max . Thus, the corresponding M can be calculated as M = ( For each interfering well, scale its cumulative production records to the master curve in Equation (8) using two scaling parameters, τ and M. The corresponding EURs are calculated from Equation (11).

5.
For each refrac well, scale its cumulative production records to the master curve in Equation (16), and obtain optimum values of the scaling parameters τ, M 0 , and a j , j = 1, 2, . . . . The corresponding EURs are calculated from Equation (18). 6.
For each newly completed well, calculate its EUR using expected values of τ and M from comparable interfering wells that were completed between 2017 and 2019. Use Equation (11).
Notice that the EURs calculated from Equations (11) and (18) are in mass units. We use oil density to convert these EURs into thousands or millions of barrels (kbbl or Mbbl).

Results and Discussion
A quick scoping analysis shows the distribution of oil rates from all Bakken wells after-in this example-one year on production, see Figure 3. This distribution is positively skewed, meaning that most Bakken wells have low oil rates and the best wells with the significantly higher oil rates in the long tail are rare. The Generalized Extreme Value (GEV) distribution is most suitable to model the empirical oil rate distribution in the Bakken and other shales [27,33,34].
The probability density function (PDF) and (b) the cumulative distribution function (CDF) of oil rate for the Bakken wells after 12 months on production. The field data were fit with the generalized extreme value (GEV) distribution function that perfectly matches the experimental probability distributions. The P 50 well is the expected value of the distribution, the P 10 well is exceeded by only 10% of wells, and the P 90 well is exceeded by 90% of wells.
Using the GEV statistics of well populations, we calculate the high (P 10 ), mean (P 50 ), and low (P 90 ), oil rate values from the cumulative distribution function (CDF). The P 50 well is the expected value of the distribution; the P 10 well is exceeded by only 10% of wells; and the P 90 well is exceeded by 90% of wells. Figure 4 shows the daily oil rates and cumulative oil production for all existing horizontal wells in the Middle Bakken and the Upper Three Forks. Notice that the Three Forks wells have fewer years on production because they were drilled after 2010. The P 10 , P 50 , and P 90 wells were calculated with the method discussed elsewhere [27,34] and in the caption of  In addition, many production jumps are observed. Each jump is a sudden discontinuity of oil rate that is sometimes larger than the initial oil rate in the same well, and an abrupt brake in cumulative production curve. These discontinuities may be caused by refracturing of old wells [35][36][37][38][39] or by changes of downhole well flowing pressure [40][41][42][43].

Physical Scaling Matches
We have scaled oil production from all 14,888 horizontal oil wells in the Bakken following the procedure detailed in Section 2.2. The reservoir properties used in scaling oil production are listed in Table A2. The key scaling results are shown in Figures 5-7.
At the time of this analysis, 4845 Bakken wells, 3349 in the Middle Bakken and 1496 in the Upper Three Forks, were classified as non-interfering wells, because their cumulative production was still growing linearly versus the square root of time. The hyperbolic decline (∝ 1/ √t ) of oil production rate from these wells is caused by the purely transient diffusion of oil pressure in the reservoir. Notice that the x-axis in Figure 5 is scaled with the maximum time on production, t max instead of τ and the y-axis is scaled with K √ t max to match the f (t) = √t master curve that intersects point (1,1). In the Middle Bakken, the expected values of t max and K √ t max are 75 months and 30 ktons, respectively, while in Three Forks these values are 65 months and 27 ktons. For the non-interfering wells, we assume arbitrarily that they start to decline exponentially (go under the square-root-of-time linear trend) at The interfering wells exhibit clear pressure interference between consecutive hydrofractures and are in pseudo-steady state flow regime. For each well k, this interference already occurred at the elapsed time on production t = τ k and, at the time of this analysis, each oil production rate was declining exponentially. For these wells, the recovery factors start bending down at the dimensionless time equal 0.5 < 0.64 in [24] because oil is less compressible than gas. The interfering wells scale with the master curve in  Many wells in the Bakken (2363) cannot be scaled to a non-interfering or interfering segment of the scaling curve. The poor matches are caused by multiple production jumps that may be due to refracturing of old wells [35][36][37][38][39]. We note that these jumps could also result from artificial lift and step decreases of downhole well pressure [40][41][42][43]. Each of these wells is scaled individually with a modification of the full physical scaling curve coupled with superposition (illustrated in Figure 2). Four examples of "refrac wells" are shown in Figure 7. The wells in panels a and c have been completed in the Middle Bakken formation, while the those in panels b and d in the Upper Three Forks formation. As we can see, the modified scaling curve matches the discontinuous production records. The magnitude of each a in Equation (14) may reflect effectiveness of a specific refracturing job. For all "refrac wells," the expected value of a is about 0.6, a little better than 50%.
The remaining 1279 newly completed wells had less than 12 months on production at the time of this analysis. The oil production rates from these wells are usually unstable due to backflow of fracturing fluids, excessive choking and/or lack of takeaway capacity [24,25,[44][45][46][47][48].
During the first three months on production in each well, the produced cumulative oil volume is commonly below the square-root-of-time line due to the unloading of SRV from fracturing fluids, whose flow blocks oil production, [24,25]. If a well does not have enough of stable production history, its scaling will underestimate both τ and M. Therefore Notice that the magnitude of τ is lower and that of M is higher for the new wells. These shifts are caused by the improvements of well completion technology over the last decade [49][50][51][52][53][54][55], i.e., by bigger hydrofractures, fewer clusters per stage, more fracture stages, etc.  Table A3 details the interference times, τ, and the masses of oil in place, M, for the four well classes: (1) interfering, (2) non-interfering, (3) refracs, and (4) newly completed wells in the Middle Bakken and Upper Three Forks. The surface location maps of these four well classes are shown in Figure 8 for both reservoirs. For the Middle Bakken, wells in the first three classes spread extensively from the center of the Williston Basin in North Dakota to the Elm Coulee Field in Montana, while in the Three Forks few wells were drilled in Montana. Interestingly, the newly completed wells are located mostly in the central part of the Williston Basin in North Dakota. There were some new wells completed in Montana; however, they became inactive within the first few months on production. In general, the magnitudes of τ and M in the Upper Three Forks are lower than those in the Middle Bakken, indicating that the Three Forks wells decline faster and have lower EURs than those in the Middle Bakken. The refrac wells have the highest values of M, because of the incremental stimulated reservoir volumes due to refracs. The non-interfering wells have slightly higher EURs than the interfering wells, because of the larger τ and M values. As mentioned before, the newly completed wells have shorter τs and larger Ms than all other well classes. Each of these wells starts from a very high initial production rate that declines very fast.

EUR Predictions
To predict the fieldwide EUR, all individual scaling curves for the 14,888 existing wells in the Bakken are summed up vs. calendar time (Figure 9). For each well class, we note the high fidelity of the physical scaling approach in retracing old production and-by implication-forecasting future production of the existing wells. From Figure 9 (left), we infer that the second production peak in the last three years is not only the contribution from the newly completed wells that are highly productive, but also from the old wells that have been refractured or pumped off recently. From Figure 9 (right), it follows that the expected estimated ultimate recovery from all 14,888 wells in the Bakken play is about 715 million m 3 (4.5 billion bbl) of oil. The EURs for the interfering and non-interfering well classes are almost the same at about 240 million m 3 (1.5 billion bbl) each, as the respective well numbers are close: 6401 and 4845. The refrac wells have a larger average EUR per well with total EUR of 160 million m 3 (1 billion bbl) from only 2363 wells. The recent 1279 wells also have a higher EUR per well with the total EUR of 80 million m 3 (0.5 billion bbl). One can argue that success of the newly completed wells in yielding the highest EURs is not only due to the significant improvements in well completion techniques, but also to the favorable drilling location selections across the most prolific area in the Bakken, near the center of the Williston Basin (recall Figure 8).  Figure 10 shows that our physical scaling method and the hyperbolic decline curve analysis (DCA) give different estimates of oil production from the Bakken shale. To solve for the matching parameters of hyperbolic DCA, initial oil rate, q i ; initial decline rate, D i ; and the hyperbolic exponent, b, we used the algorithm described in the Reservoir Engineering Handbook (Fourth Edition) by Tarek Ahmed [56]. In addition, we wrote a MATLAB R program to automate the hyperbolic DCA matches of 14,888 Bakken oil wells. Figure 10a,b are examples of the respective matches of oil production from well -25-083-XXXXX. The rate is plotted on a semilog scale and there seem to be no large discrepancies between the physical scaling (blue curve) and hyperbolic DCA (red curve) matches of the past oil rate (black line). However, when we integrate the rate estimates from both methods to obtain cumulative production, the hyperbolic DCA fails to trace past production and gives an overly optimistic cumulative oil of 64,000 m 3 (400 kbbl) at the last production time recorded. In contrast, the physical scaling yields 55,600 m 3 (350 kbbl) matching exactly historical data. We also quality-checked both methods for 11,246 non-refractured wells in the Bakken, see Figure 10c,d. For each well, each point in these cross-plots quantifies agreement between the actual annual rate (the volume of oil produced during 12 months on production in kbbl/year), in year 1, 2, . . . , and the corresponding values predicted with both methods. There are 61,230 points in each of the two cross-plots. Ideally, these points would follow the diagonal line, y = x. For DCA, best data fit, y = 1.12x + 1.14, (the blue line) is biased towards excessive predicted production volumes. The linear model explains the data almost perfectly and its 95% confidence interval (CI) is not plotted. The two dashed blue lines denote the 95% CI band that a new data point in the same set of wells would fall inside of this band. The physical scaling model is not biased, y = 0.99x − 0.13, and its 95% CI is narrower. Finally, we stacked up all rate estimates from the 14,888 oil wells in the Bakken to obtain total field rate shown in Figure 10e. At the field scale, the systematic upward bias of hyperbolic DCA is amplified, thus, the red line fails to track the black line. In contrast, the stack of individual physical scaling matches (blue line) traces historical production. The total field cumulative is shown in Figure 10f. In the end, there is a big discrepancy between both methods of predicting total field EUR from the same field production data set. The physical scaling predicts the EUR from the 14,888 existing wells at 715 million m 3 (4.5 billion bbl), while the hyperbolic DCA is overly optimistic, giving a 45% higher estimate of 1 billion m 3 (6.5 billion bbl).
The current model is a mechanistic scaling of well production in the Bakken shale based on the first order physics of production mechanisms in hydrofractured horizontal wells and on geology. It served as the physical foundation of the statistical prediction [34] of ultimate production in the Bakken based on the Generalized Extreme Value (GEV) statistics. The current scaling model provides an unbiased match of all historical data from all wells in the Bakken. The width of the 95% CI in Figure 10d reflects the natural variability of reservoir and well properties that influence the model predictions around their expected values listed in Table A2. The 61,230 points that make the comparison in Figure 10d demonstrate that the stochastic variability of reservoir and well properties is rather small and the model does an excellent job finding the unbiased values of original oil in place M and pressure interference time τ for each of the 14,888 wells in the Bakken.  ,d) show the quality of individual matches of 11,246 non-refractured wells in the Bakken. Overall, our physical scaling matches the actual annual production better than the hyperbolic DCA (according to the R 2 values). Individual matches of all wells in the Bakken with both methods are stacked up to reconstruct the past total field oil rate (e) and cumulative production (f). The physical scaling traces perfectly historical production. In contrast, the hyperbolic DCA systematically overestimates the field EUR, yielding 1 billion m 3 (6.5 billion bbl), an estimate that is 45% higher than that obtained with physical scaling.
To compare our results with EUR predictions from the US DOE Energy Information Administration [6], we have remapped each Bakken assessment unit onto counties and calculated the expected value of EUR for each county. The details are tabulated in Table A4. We predict the EUR from all 14,888 active horizontal oil wells in the Bakken to be 715 million m 3 (4.5 billion barrels of oil) or ∼48 thousand m 3 /well (300 kbbl/well), with 493 million m 3 (3.1 billion barrels) coming from 9894 wells in the Middle Bakken and 222 million m 3 (1.4 billion bbl) coming from 4994 wells in the Upper Three Forks . The results in Table A4 are also displayed as stacked bar graphs for the Middle Bakken and the Upper Three Forks (Figure 11). The stacked bars show the numbers of existing wells from the DrillingInfo dataset in 2019, and the potential wells from [6] in each Bakken Total Petroleum System (TPS) assessment unit mapped onto the corresponding counties in the Middle Bakken and Upper Three Forks Formations. The lines show the corresponding EUR values from the physical scaling and the EIA statistical predictions. The P 10 , P 50 , and P 90 wells were obtained by applying the generalized extreme value (GEV) distribution [27,34] for each set of EURs from the scaling curve method in each sub-region as demonstrated before in Figure 3. Figure 11. The stacked bars show the numbers of existing wells from the DrillingInfo dataset in 2019, and the potential wells from the EIA report [6] in each Bakken TPS assessment unit mapped onto the corresponding counties in the Middle Bakken and Upper Three Forks Formations. The lines show the corresponding EUR values from the physical scaling and the EIA statistical predictions. The P 10 , P 50 , and P 90 wells were obtained by applying the generalized extreme value (GEV) distribution for each set of EURs from the scaling curve method in each sub-region [34]. The two-letter codes abbreviate the five Bakken assessment units detailed in Appendix A: Central Basin (CB), Eastern Transitional (ET), Nesson-Little Knife (NL), and Northwest Transitional (NT).
In most cases, where there are enough wells per sample, the expected P 50 EURs from physical scaling agree with the EUR estimates by EIA. One can observe that most of the best regions with the high EUR values have been extensively drilled leaving no or few potential wells to infill. Mountrail-NL is an extreme example of over-drilling in the best region of the Bakken, where the number of existing wells exceeds the number of EIA potential wells by 40% (the green bar is taller than the gray bar).
In another paper [34], we have calculated well density of all regions in the Bakken, revealing a surprising result that the core area of the Bakken will be fully drilled by 2021. In contrast, most of the 85,334 potential wells proposed by EIA will be in the poorest regions of the Bakken. Figure 12 maps the EUR results from physical scaling approach for 14,888 horizontal oil wells in the Bakken. Notice that this map resembles more the water cut map (Figure A3), rather than the GOR map ( Figure A3).

Figure 12.
A map of EURs obtained from the universal scaling curve method for 14,888 horizontal wells in Bakken shale. Notice that this map resembles the water cut map, rather than the GOR map. This indicates that EUR is negatively correlated with water cut.
In Figure 13a, we show that EUR is negatively correlated with water cut (1 − oil cut). On the other hand, there is little correlation between EUR and lateral length, as depicted in Figure 13b. Doubling lateral length from 1.5 to 3 km (5000 to 10,000 ft) does not double average EUR that increases from 38 to 49 thousands of m 3 (240 to 310 kbbl). This finding may also suggest that better technology cannot increase EUR if a well is drilled in a low-productivity region of the Bakken.  Figure 14 shows the average values of pressure interference time, τ, lateral length, gas-oil ratio, and water cut for each six-month time interval in calendar dates. The value of τ decreases significantly over time meaning that newer wells pressure-interfere and decline faster than the old ones. Equation (7) shows that τ is directly proportional to the square of half-distance, d, between two consecutive hydrofractures. This means that halving the interfracture distance decreases τ four-fold. Historically, the number of hydrofracture stages in the Bakken has increased over time from 8 stages in 2007 [52], to 18 stages in 2009 [53], 35 stages in 2016 [54], and to as many as 60 stages in 2019 [55]. The lateral length has also doubled from 1.5 km (5000 ft) in 2005 to 3 km (10,000 ft) in 2015. In summary, the new completion techniques in the Bakken shale rely on more of bigger hydraulic fractures and on longer laterals [49][50][51]. These new completions have significant side-effects. Cumulative gas-oil ratio has been increasing in both the Middle Bakken and Three Forks. The likely culprits are reservoir depressuring by hydrofractures that are too close and the increasingly aggressive artificial lift. Degassing the low permeability and porosity reservoirs in the Bakken spells trouble for future production, because solution-and free gas are the main driving mechanisms of primary depletion. Water cut has also increased with time revealing another negative side-effect of the recent Bakken field development. The increasing water cut might be caused by drilling new low-productivity wells in the immature regions of Bakken [57][58][59], and/or by the longer and taller hydraulic fractures that break into adjacent water-bearing formations, the Lodgepole and/or Lower Three Forks [60][61][62].

Conclusions
We have delivered a comprehensive physics-based scaling of all horizontal hydrofractured wells completed in the Middle Bakken and Upper Three Forks between 1 January 2000, and 1 May 2019. In particular:

•
The current 14,888 active oil wells in the Bakken shale will ultimately produce 715 million m 3 In general, wells completed in the Middle Bakken produce more oil than those in the Upper Three Forks due to: (1) lower water saturation and water cut, (2) slower decline rate (longer pressure interference times, τ), and (3) higher initial oil in place (larger M). • Newly completed wells start from very high initial oil rates but in general decline faster than the pre-2010 wells. Still, we predict higher EURs for the newly completed wells.
• The more productive newer wells result from recent advancements in completion technology: longer laterals, larger hydrofractures, bigger stimulated reservoir volumes, and more fracture stages.

•
Operators have also learned to drill new wells only in the most prolific area of the Bakken region at the center of the Williston Basin.

•
With time, negative trends in oil production have amplified in the Bakken. We observe higher GOR values (reservoir degassing); higher water cuts (contacting water-bearing formations and drilling in regions with lower oil saturations); faster decline rates; and excessive well density, especially in the most prolific areas.

•
When all most productive areas in the Bakken are extensively drilled, the poor immature areas with high water production will be the only targets left for infill drilling. In this case, technology will not enhance much performance of the future wells.

•
We encourage practitioners to adopt our fast and accurate method of predicting oil production in shales that is a viable alternative to the hyperbolic DCA which yields an 'illusory picture' of shale oil resources.
Elsewhere [34], we have outlined a hybrid method that combines the precise GEV statistics of horizontal oil wells in the Bakken with the physics-based scaling method developed in this paper and combined with late-time inflow [26]. This hybrid approach follows from our comprehensive evaluation of the Barnett play futures [27].

Appendix A. Overview of the Bakken Shale Play
The Bakken mudrock play was discovered in 1953, in the Antelope Field, North Dakota, with the completion of the #1 Woodrow Starr well by Stanolind O&G Corp. [63]. Outside of Antelope field, the Bakken formation was not considered to be economically viable because it was generally impermeable [63,64]. The success of old vertical wells drilled in the Antelope Field depended upon naturals in the Bakken and Three Forks (Sanish) Formations [64].
The Bakken was not developed until the early 2000s, when about 20,000 barrels of oil per day were produced from the Elm Coulee field in Montana. Improvements of horizontal drilling techniques and advancements in hydraulic fracturing enabled massive drilling in the Bakken in the past ten years. By 2014, more than 10,000 horizontal wells were drilled in 14 counties in Montana and North Dakota, leading to the first production peak in the Bakken at 0.19 million m 3 /day (1.22 Mbopd). In early 2019, Bakken ranked as the second largest oil producer in the US, with the second major oil peak of 0.23 million m 3 /day (1.45 Mbopd) from 14,888 horizontal, hydrofractured oil wells. The predicted technically recoverable oil in the Bakken region is about 1.18 billion m 3 (7.38 billion barrels) of oil according to the US Geological Survey [4], and 15.6 billion barrels of oil according to the US DOE Energy Information Administration [6].   Table A1.
The average formation depth of all layers is about 3108 m (10,120 ft) . At this depth, the initial formation pressure and temperature are about 37 Mpa (5300 psia) and 105 • C (220 • F), respectively. Because the Upper and Lower Bakken formations are shales, the gamma-ray log readings there are as high as 690 • API and the permeability is as low as 2 × 10 −19 m 2 (200 nanodarcy). The main drilling targets, Middle Bakken and Upper Three Forks, have a better permeability of about 5 × 10 −17 m 2 (45 microdarcy). The Upper Three Forks formation has a slightly higher porosity and thickness than those in the Middle Bakken, but it also has a higher water saturation, indicating that the Three Forks wells must produce more water. (b) Figure A2. Bakken well log data of well-33-055-XXXXX (a) and well-33-053-XXXXX (b) from the North Dakota Department of Mineral Resources-Oil and Gas Division home page. Table A1. The average reservoir properties from well log data shown in Figure A2.  [4].

Formation Upper Bakken MiddleBakken LowerBakken Upper Three Forks
About 75% of all active wells are in the Nesson-Little Knife and the Central Basin, the two most productive AUs in the Bakken region. By mid-2019, 14,888 horizontal wells were completed in two main reservoir layers: the Middle Bakken (9894 wells) and the Upper Three Forks (4994 wells). The maps of gas-oil ratio (GOR) and water cut (WC) in these two main layers are shown in Figure A3a,b. The high GOR wells are the red dots at the center, which is the most productive part of the Williston Basin. In the Bakken, operators are producing more oil than water in the Nesson anticline area, Parshall Field, and Elm Coulee Field. In the Three Forks formation water cut values are higher than those in the Bakken formation.
The GOR and WC contours may reflect maturity levels of the Bakken shale members and their productivity. The higher the GOR value, the more gas is dissolved in oil, giving more solution gas/compaction drive energy that increases oil recovery [28]. On the other hand, water cut is the ratio between water and total liquid production rates. WC is proportional to water saturation in the reservoir. A mature shale formation tends to have a low water cut and high GOR. The center of the basin seems to be the most productive region with a perfect combination of high GOR and low water cut. This particular area has the highest well density (the number of wells per square mile) that shows just how aggressive oil production strategy is in this most prolific part of the Bakken shale [34].