M O B J E C T I V I S T: The Role of Dispersive Discovery in Reserve Growth

Shorter post: Enigma Solved

"We must accept finite disappointment, but we must never lose infinite hope." -- Martin Luther King Jr.

"Reserving judgments is a matter of infinite hope." -- F. Scott Fitzgerald in The Great Gatsby

We have on our hands a huge swindle pertaining to the reserve growth analysis promulgated by USGS geologists. By turns, the establishment has labeled the fossil fuel reserve growth issue an "enigma" or a "puzzle".

For that reason the United States Geological Survey (USGS) considers [this] analysis "arguably the most significant research problem in the field of hydrocarbon resources assessment."

I believe that the big technical issue the USGS has historically had with its analysis has to do with using "censored data". This essentially says that you should take special care when extrapolating data backwards, considering you have only a truncated time-series data set. But after more consideration, the problem has become even more painfully obvious. It really stems from the lack of a good value for the initial discovery estimate.

To lead me down this path, I used my generalized Dispersive Discovery Model. In terms of modeling reserve growth, the dispersion generates a tail for accumulating further discoveries after the initial estimate. For constant average growth, the model looks like this:

DD(t) = 1 / (1/L + 1/(k*t))

Note that at time t=0 the discovered amount starts at zero, and the accumulated amount then approaches a value proportional to L, which one should consider the characteristic depth of the reservoir. The basic premise of reserve growth, and the one on which USGS geologists such as Attanasi & Root¹ and Verma² frame their arguments, treats reserve growth as a multiplicative factor of the initial estimate. They see numbers that reach a value of 10x after 90 years and claim that this has some real physical significance, almost offering up hope for still-to-come huge reserve benefits.
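A minimal sketch of the DD curve, to make the two limits concrete. The values L=100 and k=1 are illustrative assumptions, not fitted numbers, and the function name is hypothetical. In this particular form the curve starts at zero and levels off at L itself.

```python
def dispersive_discovery(t, L=100.0, k=1.0):
    """DD(t) = 1 / (1/L + 1/(k*t)), with DD defined as 0 at t <= 0."""
    if t <= 0:
        return 0.0
    return 1.0 / (1.0 / L + 1.0 / (k * t))


# Early on the curve tracks k*t; late it saturates toward L.
for t in (0, 1, 10, 100, 1000, 100000):
    print(t, round(dispersive_discovery(t), 3))
```

For k*t much smaller than L the curve behaves like k*t, which is why an estimate taken very early can be arbitrarily small.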

The swindle with all this has to do with when the original estimate is made. Conceivably you can make estimates that occur very early in the lifespan of a reservoir, and you will get very low estimates for estimated discovery size. You might find the initial estimate to fill a sewing thimble. Now if that estimate grows at all, you can get huge apparent reserve growth factors, some fraction approaching infinity in fact. In contrast, you can wait a couple of years and then report the data. The later years' growth factor will be proportionately much less. Now if you consider that in other parts of the world, countries report reserves less conservatively than the USA, then the reserve growth factors can vary even more wildly.
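The "thimble" effect can be sketched numerically. This assumes the simple DD curve with illustrative values L=100 and k=1, and defines the apparent growth factor as the estimate at year 90 divided by the estimate at the reporting time t0 (a hypothetical helper, not a USGS definition).

```python
def dd(t, L=100.0, k=1.0):
    """Dispersive discovery curve DD(t) = 1 / (1/L + 1/(k*t)); zero at t <= 0."""
    if t <= 0:
        return 0.0
    return 1.0 / (1.0 / L + 1.0 / (k * t))


def apparent_growth(t, t0, L=100.0, k=1.0):
    """Ratio of the estimate at time t to the initial estimate made at t0 > 0."""
    return dd(t, L, k) / dd(t0, L, k)


# Same reservoir, same 90-year horizon, different dates for the "initial" estimate.
for t0 in (0.001, 0.1, 1.0, 5.0, 10.0):
    print(f"initial estimate at t0={t0:>6}: growth factor = {apparent_growth(90.0, t0):10.1f}")
```

Nothing about the reservoir changes between rows; only the date of the first report does, and the growth factor swings by orders of magnitude.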

I used USA data from an Attanasi & Root paper¹ which you can find a dump of here. Initially, I plotted the data as a fractional yearly growth curve:

The key insight to understand the growth factor in terms of the DD model has to do with averaging the initial discovery point over a relatively small window of time starting from t=0. This effectively samples the infinite values of growth against other finite values. The use of the sampling/integration window brings down the potentially infinite (or at least very large) growth factor to something more realistic.

The math on this derives easily into an analytic form, and we end up with this strange-looking function, where A indicates the integration window:

U(t) = exp( ((A+t)*ln(A+t) - t*ln(t) - A) / A )

It turns out that the value for characteristic depth L does not even play into the result, as long as it gets set to some relatively large number. The value for the integration window A does not play a big role either, as it serves mainly to avoid generating a singularity and turning it back into the original DD formula.
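One way to read the U(t) formula: its exponent equals (1/A) times the integral of ln(t+x) for x from 0 to A, so U(t) is the geometric mean of (t+x) over the window. The sketch below checks that identity numerically. Treating this as the large-L, k=1 limit is an inference from the formula rather than something stated above, and A=6 and the function names are illustrative assumptions.

```python
import math


def u(t, A=6.0):
    """Closed form U(t) = exp(((A+t)*ln(A+t) - t*ln(t) - A) / A)."""
    t_ln_t = t * math.log(t) if t > 0 else 0.0  # t*ln(t) -> 0 as t -> 0
    return math.exp(((A + t) * math.log(A + t) - t_ln_t - A) / A)


def growth_factor(t, A=6.0):
    """U(t) rescaled so that the factor equals 1 at t = 0."""
    return u(t, A) / u(0.0, A)


def geometric_mean_numeric(t, A=6.0, n=100000):
    """Midpoint-rule estimate of exp((1/A) * integral of ln(t+x) over [0, A])."""
    h = A / n
    total = sum(math.log(t + (i + 0.5) * h) for i in range(n))
    return math.exp(total * h / A)


print(u(10.0), geometric_mean_numeric(10.0))  # the two should agree closely
print([round(growth_factor(t), 2) for t in (0, 1, 5, 10, 30, 90)])
```

At t=0 the closed form reduces to A/e, which is finite for any A > 0. That is the sense in which the window removes the singularity.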

In terms of a spreadsheet, this turns into a discrete generating function, with the yearly estimates based on the growth factor of the preceding year. With absolutely no fudge factors, I plotted the curve directly against the A&R data below. After the hairs on the back of my neck settled down, I realized that this function has some type of fundamental golden ratio property. It essentially generates growth factors based solely on the maximum entropy dispersion in the underlying model. In other words, the "enigmatic" reserve growth has turned from a puzzle into a mathematical curiosity resulting solely from simple stochastic effects.

Plotting against an Attanasi & Root figure, it lays cleanly on top of it, showing discrepancies only on some very old outlier data.

Interestingly, the reserve growth looks like it will continue to reach infinite values, but this turns out to stem solely from the possibility of "thimble"-sized initial estimates. The bottom line: if we continue to make poor initial estimates for discoveries, we will continue to pay the price for acting surprised at the "huge" reserve growth we have. In other words, the swindle has played out in our heads.

If you look back in the literature, you find hints that support the dispersive effects of accumulated reservoir estimates (note that they use the term dispersion).

A graphic illustration of the very broad URA data dispersion that occurs when grouping fields across geologic types and geographic areas was provided by the National Petroleum Council (NPC) and is reproduced with minor modification in Figure FE5.

I peeked back at the previous "model" that the USGS's Verma postulated² based on the "modified Arrington" approach and realized that it comes about purely from heuristic considerations.

CGF=1.7378(YSD)^0.3152
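For reference, a small sketch that evaluates this power law. It reads CGF as the cumulative growth factor and YSD as years since discovery, which the formula above does not spell out, so treat both as assumptions. The function name is hypothetical.

```python
def modified_arrington_cgf(ysd):
    """Heuristic power law: CGF = 1.7378 * YSD**0.3152 (YSD assumed in years)."""
    return 1.7378 * ysd ** 0.3152


for years in (1, 10, 30, 90):
    print(years, round(modified_arrington_cgf(years), 2))
```

A pure power law of this kind keeps growing without bound and has no parameter tied to reservoir size.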

You have to ask yourself how these professionals get away with publishing stuff based on hacking and speculation when the result instead has such a simple statistical and mathematical foundation. I really find nothing complicated about the mathematics (even though it has taken me some time to arrive at my current state of understanding). I call this combination of using simple models and using straightforward calculus and probabilities a form of pragmathematics -- just something you do to understand the physical foundation for the data we observe.

The actual enigma of reserve growth I think has to do with the cluelessness of the USGS and the secrecy and inscrutability of the oil industry. You would think they would have figured out the reserve growth puzzle long ago. I guess they thought that deferring the reality would surely provide us with infinite hope. As King would likely offer, we just have to look elsewhere; certainly BigMrBossman won't provide any guidance.

References:
¹Attanasi, E.D., and Root, D.H., 1994, The enigma of oil and gas field growth: American Association of Petroleum Geologists Bulletin, v. 78, no. 3, p. 321-332.
² Verma, M.K., 2003, Modified Arrington Method for Calculating Reserve Growth — A New Model for United States Oil and Gas Fields, U.S. Department of the Interior, U.S. Geological Survey

Update:
Google Spreadsheet featuring full formula here: http://spreadsheets.google.com/pub?key=prVeQf4uJHD1AJVF_pGg9Jw

Update 2:
As I find it virtually impossible to show math derivations with the Blogger editor, I pasted my derivation from a math markup tool below. Equation 3 shows the technique for averaging initial estimates over the time window A, transforming U into U-bar. Equation 5 describes the final result, where the constant C scales the reserve growth factor to start at 1. The term L/k essentially gives the shape of the curve, with smaller values relative to A giving a more horizontal asymptote.
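Since the derivation image is not reproduced here, the following is only a sketch of the finite-L result. It uses the expression quoted in the comment thread below (with natural logs) and the republished values L=24, A=6, K=1. Dividing by the t=0 value is an assumed stand-in for the constant C, chosen so the factor starts at 1. Function names are hypothetical.

```python
import math


def xlogx(x):
    """x * ln(x), using the limit value 0 at x = 0."""
    return x * math.log(x) if x > 0 else 0.0


def averaged_dd(t, L=24.0, A=6.0, K=1.0):
    """Window-averaged curve, using the expression quoted in the comments."""
    kt = K * t
    exponent = (xlogx(A + kt) + xlogx(L + kt) - xlogx(kt) - xlogx(A + kt + L)) / A
    return math.exp(exponent)


def reserve_growth(t, L=24.0, A=6.0, K=1.0):
    """Assumed normalization: divide by the t = 0 value so growth starts at 1."""
    return averaged_dd(t, L, A, K) / averaged_dd(0.0, L, A, K)


for years in (0, 1, 5, 10, 30, 90):
    print(years, round(reserve_growth(years), 2))
```

With a finite L the normalized curve flattens toward a ceiling instead of climbing forever, consistent with the remark that L/k sets the shape of the asymptote.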

5 Comments:

Professor Greyzone said...: In the United States at least, there are potentially large financial penalties from the SEC for overstating reserves. Someone at TOD called the initial estimates "conservative" and so far, I tend to agree. If I have someone holding an economic gun to my head for overstating my reserves, I am going to understate them to be sure I don't get in trouble. That this action in turn causes the appearance of large scale reserve growth is irrelevant to me (an oil company) so long as I survive.; 10:52 PM
Professor @whut said...: I see that you realize that overestimating the size of a reservoir has significant financial advantages. You can attract huge initial investment capital if you exaggerate a claim. But this can also attract charlatans. So I agree with you that the SEC put the stops on that practice and the oil companies tried to make their estimates as high as they could without overdoing it. But they still had no idea how to characterize their estimates (not smart? who knows?).

Bottom line, I believe the estimates are safe and non-predictive, but the underlying reality has to do with the dispersion of searches through the volume of a reservoir. This makes it potentially predictive. The simple model I use for reserve growth is complementary to what Andy Grove did in the 60's. Unfortunately, it is the next century, and some nobodies are finally figuring out what was staring them and the USGS in the face for years.

I also don't think that petroleum engineers have any control over marketing decisions. The "marketing" engineers were likely stooges of upper management. It's possible that oil industry engineers knew about this characterization I discovered all along but were superseded by the board's decisions.

It's fun to speculate on oil industry psychology, but right now I am serious about substantiating the model. I am sure we will get some more good feedback and this will become a solid model.; 5:24 AM
Professor Khebab said...: I've tried to reproduce your calculations shown in the spreadsheet, I don't get the same curve using:

exp ( ( (A+K*t)*log(A+K*t) + (L+K*t)*log(L+K*t) - K*t*log(K*t) - (A+K*t+L)*log(A+K*t+L) ) / A )

Did you really use L=100, A=1, and K=1?

The closest result I got is using K=10 and A=1,L=100.

Also, your equation in your spreadsheet differs from Equation (5) but it seems to be only a scaling factor.

Thanks!; 7:49 AM
Professor @whut said...: Good catch. I realized that I have a different version stored on Google docs than the one that Google had "published". Curiously, the published version does not allow you to view the formulas.
I republished with the values of L=24, A=6, and K=1.
Also, there is an inconsistency with the "log" that I use in the title cell with the "ln" I used in the calculation cell. Please go with the natural log, as this is sloppiness on my part. I might have even swapped the wrong log in the formula at some point.
I definitely left this chart in an inconsistent state. Sorry.; 7:21 PM
Professor @whut said...: Khebab,
Another bit of insight that I have gained since I posted this calculation-- I don't believe that the "averaged" reserve growth is that important for uses such as convolution. For example, I contend that the unaveraged form could be just as useful for an HSM fit. The averaging essentially only acts as a short moving average on the end result.

I went through the averaging basically to compare against what the geologists have been doing, which unfortunately does not necessarily express the salient features of the result.; 8:09 PM