M O B J E C T I V I S T: Production as Discovery?

In the comments section to the dispersive oil discovery model post, Khebab applied the equation to USA data. As the model should scale from global down to distinct regions, these kinds of analyses provide a good test of the validity of the model.

In particular, Khebab concentrated on the data near the peak position to ostensibly try to figure out the potential effects of reserve growth on reported discoveries. He generated a very interesting preliminary result which deserves careful consideration (if Khebab does not pursue this further, I definitely will). In any case, it definitely got me going to investigate data from some fresh perspectives.

After grinding away for a while on the available USA production and discovery data, I noticed that over the larger range of USA discoveries, i.e. inferring from production back to 1859, the general profile for yearly discoveries would not affect the production profile that much on a semi-log plot. The shock model extraction model to first order shifts the discovery curve and broadens/scales the peak shape a bit -- something fairly well understood if you consider that the shock model acts like a phase-shifting IIR filter. So on a whim, and figuring that we may have a good empirical result, I tried fitting the USA production data to the dispersive discovery model, bypassing the shock model response.

I used the USA production data from EIA which extends back to 1859 and to the first recorded production out of Titusville, PA of 2000 barrels (see for timeline). I plotted this on a semi-log plot to cover the substantial dynamic range in the data.

This curve used the n=6 equation, an initial t_0 of 1838, a value for k of 0.0000215 (in units of 1000 barrels to match EIA), and a D_d of 260 GB.

D(t) = kt⁶ * (1 - exp(-D_d/(kt⁶)))
dD(t)/dt = 6kt⁵ * (1 - exp(-D_d/(kt⁶)) * (1 + D_d/(kt⁶)))

The peak appears right around 1971. I essentially set P(t) = dD(t)/dt as the model curve.
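The fit is easy to reproduce numerically. Here is a minimal sketch in Python, assuming t counts years since t_0 = 1838 and that D_d = 260 GB is entered as 260e6 in the same thousand-barrel units as k. The function names and the 1860-2050 scan range are illustrative choices, not part of the original analysis.

```python
import math

T0 = 1838      # t_0 quoted in the post
K = 2.15e-5    # k, in thousands of barrels per year^6
D_D = 260e6    # D_d = 260 GB, assumed here to mean 260e6 thousand barrels
N = 6          # exponent of the n=6 equation

def cumulative(year):
    """D(t) = k t^n (1 - exp(-D_d / (k t^n))), with t = year - T0 (assumed)."""
    t = year - T0
    if t <= 0:
        return 0.0
    g = K * t**N
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny
    return -g * math.expm1(-D_D / g)

def rate(year):
    """dD/dt = n k t^(n-1) (1 - exp(-x) (1 + x)), with x = D_d / (k t^n)."""
    t = year - T0
    if t <= 0:
        return 0.0
    x = D_D / (K * t**N)
    return N * K * t**(N - 1) * (1.0 - math.exp(-x) * (1.0 + x))

# Scan whole years for the maximum of the model curve P(t) = dD/dt
years = range(1860, 2051)
peak_year = max(years, key=rate)

print("model peak year:", peak_year)
print("model peak rate (thousand barrels/yr):", round(rate(peak_year)))
print("cumulative by 2050 (thousand barrels):", round(cumulative(2050)))
```

Overlaying `rate(year)` on the EIA series with a logarithmic y-axis should give the semi-log comparison described here. The two functions can also be checked against each other by differencing `cumulative` over one year.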

I find this result very intriguing because, with just a few parameters, we can effectively fit the range of oil production over 3 orders of magnitude, hit the peak position, produce an arguable t_0 (thanks Khebab for this insight), and actually generate a predictive down-slope for the out-years. Even the only point that doesn't fit on the curve, the initial year's data from Drake's well, figures somewhere in the ballpark considering this strike arose from a purely discrete and deterministic draw from the larger context of a stochastic model.

(I nicked this figure off of an honors thesis, look at the date of the reference!)

Stuart Staniford of TOD originally tried to fit the curve on a semi-log plot, and had some arguable success with a Gaussian fit. Over the dynamic range, it fit much better than a logistic, but unfortunately did not nail the peak position and didn't appear to predict future production. The Gaussian also did not make much sense apart from some hand-wavy central limit theorem considerations.

Even before Staniford, King Hubbert gave the semi-log fit a try and perhaps mistakenly saw an exponential increase in production from a portion of the curve -- something that I would consider a coincidental flat part in the power-law growth curve.
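To see why a stretch of power-law growth can pass for an exponential, note that in the early-time limit (where D_d/(kt⁶) is large) the model rate reduces to 6kt⁵, so the slope of ln P is 5/t rather than a constant. The sketch below tabulates that local slope, assuming t is measured from t_0 = 1838. The sample years are arbitrary illustrations, not dates taken from Hubbert's analysis.

```python
import math

T0 = 1838    # t_0 quoted in the post
N = 6        # exponent of the n=6 equation
K = 2.15e-5  # k quoted in the post (it cancels out of the slope)

def early_rate(year):
    """Early-time limit of dD/dt, n k t^(n-1), valid while D_d/(k t^n) is large."""
    t = year - T0
    return N * K * t**(N - 1)

def local_growth(year, h=0.5):
    """Slope of ln P per year, from a central difference of width 2h."""
    return (math.log(early_rate(year + h)) - math.log(early_rate(year - h))) / (2 * h)

# Illustrative sample years; a true exponential would give the same slope in every row
for year in (1880, 1900, 1920, 1940):
    slope = local_growth(year)
    print(year, f"numeric {100 * slope:.2f}%/yr", f"5/t {100 * 5 / (year - T0):.2f}%/yr")
```

The slope falls steadily with time but only slowly, so over a few decades the semi-log plot can look close to a straight line.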

At the moment, I would not of course toss the shock model, as it accurately reflects the shift from peak discovery to peak production in addition to modelling subtle production variations, but this discovery/production heuristic looks promising. Can you imagine what this does to the HL fitting approach? Stay tuned.

2 Comments:

Professor Khebab said...: It's a strange result but in my opinion there is a lot more in the Shock Model compared to a simple curve fitting: 1) The shifting and spreading of the initial discovery strikes; 2) the fluctuations of the extraction rate capturing oil shocks. As a first approximation, we can say that the oil production profile is a simple time shift, or a linear transformation, of the dispersive oil discovery curve (i.e. oil discovery + reserve growth) so the latter can be used as a parametric model for a curve fitting approach.

You said:
"I noticed that over the larger range of USA discoveries, i.e. inferring from production back to 1859, the general profile for yearly discoveries would not affect the production profile that much on a semi-log plot"
Can you elaborate a little more?; 7:29 PM

Professor @whut said...: Khebab,
I think you essentially echoed what I said in the post above: "The shock model extraction model to first order shifts the discovery curve and broadens/scales the peak shape a bit -- something fairly well understood if you consider that the shock model acts like a phase-shifting IIR filter."

So I agree with your point 1).

I also said "At the moment, I would not of course toss the shock model, as it accurately reflects the shift from peak discovery to peak production in addition to modelling subtle production variations"

Which agrees with your point 2). What you refer to as fluctuations, I called subtle production variations, which is perhaps too euphemistic a description.

As for the last point, to elaborate a bit more, I would suggest that, from a signal processing standpoint, anything that changes by several orders of magnitude will be affected less by a temporal filter than by the scale of the magnitude change itself. So the magnitude change swamps the oil shock filtering, leaving the general profile approximately the same over this much larger range (both in magnitude and time).

I was tempted to demonstrate this equivalence with the two curves, production and discovery, such that they were rescaled and shifted to lie on top of each other. This basically led to my thinking of the discovery curve as a simpler heuristic, which could be used to model production in a pinch -- since it uses a very concise formula.
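A rough sketch of that kind of comparison follows. It is not the shock model itself: it assumes a single first-order lag (a one-pole IIR filter) with a hypothetical rate of 0.1 per year as a stand-in, and it reuses the curve and parameters from the post purely as a representative input shape. It then shifts the filtered curve back by the peak offset and measures the worst mismatch in decades.

```python
import math

T0, K, D_D, N = 1838, 2.15e-5, 260e6, 6  # parameters quoted in the post

def curve(year):
    """Dispersive-discovery rate dD/dt, in thousands of barrels per year."""
    t = year - T0
    if t <= 0:
        return 0.0
    x = D_D / (K * t**N)
    return N * K * t**(N - 1) * (1.0 - math.exp(-x) * (1.0 + x))

def first_order_lag(series, rate_per_year):
    """One-pole IIR low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-rate_per_year)
    y, out = 0.0, []
    for x in series:
        y += a * (x - y)
        out.append(y)
    return out

years = list(range(1839, 2101))
source = [curve(y) for y in years]
lagged = first_order_lag(source, rate_per_year=0.1)  # hypothetical rate

# Peak offset, in years, between the filtered and unfiltered curves
shift = lagged.index(max(lagged)) - source.index(max(source))

def log_gap(year):
    """|log10| mismatch after sliding the filtered curve back by `shift` years."""
    i = years.index(year)
    return abs(math.log10(lagged[i + shift] / source[i]))

worst_gap = max(log_gap(y) for y in range(1880, 2001))
dynamic_range = math.log10(max(source) / curve(1880))

print("peak shift (years):", shift)
print("worst mismatch after shifting (decades):", round(worst_gap, 2))
print("dynamic range, 1880 to peak (decades):", round(dynamic_range, 2))
```

Comparing the last two printed numbers shows how much of the profile survives the filtering once the time shift is removed. A different lag rate, or a chain of several lags, would change the shift and the mismatch.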

So basically, I agree with everything you said -- if I interpreted your concerns correctly.; 8:23 PM