## Tuesday, December 06, 2005

### Hubbert Linearization

I discovered the dirty little secret behind Hubbert Linearization. The conventional wisdom basically states that if you plot `dU/dt/U` (the production rate divided by cumulative production) versus `U`, where `U` refers to cumulative oil extracted, you can extrapolate a negatively sloped line that intercepts the `U` axis at the ultimately recoverable resources (URR). Most analysts use the logistic curve or Verhulst equation to "prove" this limiting behavior. In practice, though, any peak will do.
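To make the conventional recipe concrete, here is a minimal sketch in Python of the textbook logistic case. The parameter values (`URR`, `K`, `T0`) and the helper names are my own placeholders, not real production data. It assumes the Verhulst form `dU/dt = k U (1 - U/URR)`, for which `dU/dt/U` falls on an exact straight line in `U`.

```python
import math

def logistic_cumulative(t, urr, k, t0):
    """Cumulative extraction U(t) for a logistic (Verhulst) curve."""
    return urr / (1.0 + math.exp(-k * (t - t0)))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def hubbert_linearization(us, rates):
    """Fit (dU/dt)/U against U and return (slope, intercept, URR estimate)."""
    ratios = [p / u for p, u in zip(rates, us)]
    slope, intercept = fit_line(us, ratios)
    # The fitted line crosses zero at U = -intercept / slope.
    return slope, intercept, -intercept / slope

# Hypothetical parameters, chosen only for illustration.
URR, K, T0 = 2000.0, 0.06, 1970.0
years = range(1930, 2006)
us = [logistic_cumulative(t, URR, K, T0) for t in years]
# Exact logistic production rate: dU/dt = k * U * (1 - U / URR)
rates = [K * u * (1.0 - u / URR) for u in us]

slope, intercept, urr_est = hubbert_linearization(us, rates)
print(f"slope={slope:.3e} intercept={intercept:.4f} URR estimate={urr_est:.1f}")
```

With noise-free logistic input the fit simply hands back the `URR` that was put in, which is the circular part of the usual "proof".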

First take `dU/dt`, which gives the production rate. When plotted, this will give a Hubbert-like peak somewhere in its lifetime. Any somewhat symmetric peak when Taylor-series expanded about its center point looks like this -- an upside-down parabola:
`dU/dt = A (1 - k^2 (t - t0)^2)`
And then, any cumulative production increase looks like this near the peak -- a linear trend upward:
`U = a (1 + (b/a)(t - t0))`
Then substitute the time shifted around `t0`, `T = t - t0`, factor the parabola as `(1+kT)(1-kT)`, and you get this relationship:
`dU/dt/U = (A/a) (1+kT)(1-kT)/(1+(b/a)T)`
The two positively increasing terms, `(1+kT)` in the numerator and `(1+(b/a)T)` in the denominator, more or less cancel (exactly so when `k = b/a`), and with `C = A/a` you get
`dU/dt/U ~ C (1-kT)`
Since `U` is itself linear in `T` near the peak, a straight line in `T` is also a straight line in `U`, which basically gives the famed Hubbert linearization term. Unfortunately, it doesn't give one any insight other than proof that you can linearize an upside-down parabola. Big little deal. We need way more insight than this to make headway in our understanding of depletion. (Cue the Oil Shock Model.)
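The "any peak will do" argument can be checked numerically. The sketch below swaps in a Gaussian production peak, which is my own choice of a non-logistic symmetric curve, with made-up values for `PEAK`, `T0`, and `SIGMA`. It fits a straight line to `dU/dt/U` versus `U` over a window of half a standard deviation on either side of the peak, and prints the goodness of fit along with the extrapolated zero crossing next to the true area under the curve.

```python
import math

def gaussian_rate(t, peak, t0, sigma):
    """Production rate dU/dt for a Gaussian-shaped peak."""
    return peak * math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))

def gaussian_cumulative(t, peak, t0, sigma):
    """Cumulative production U(t): the integral of gaussian_rate up to t."""
    area = peak * sigma * math.sqrt(2.0 * math.pi)
    return 0.5 * area * (1.0 + math.erf((t - t0) / (sigma * math.sqrt(2.0))))

def linear_fit_r2(xs, ys):
    """Least-squares line through (xs, ys); returns slope, intercept, R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical peak parameters, for illustration only.
PEAK, T0, SIGMA = 30.0, 1970.0, 20.0
# Sample every half year from T0 - 10 to T0 + 10 (plus or minus 0.5 sigma).
window = [T0 + 0.5 * step for step in range(-20, 21)]
us = [gaussian_cumulative(t, PEAK, T0, SIGMA) for t in window]
ratios = [gaussian_rate(t, PEAK, T0, SIGMA) / u for t, u in zip(window, us)]

slope, intercept, r2 = linear_fit_r2(us, ratios)
x_intercept = -intercept / slope
true_urr = PEAK * SIGMA * math.sqrt(2.0 * math.pi)
print(f"R^2={r2:.5f} slope={slope:.3e}")
print(f"extrapolated intercept={x_intercept:.1f} true area={true_urr:.1f}")
```

No logistic equation appears anywhere in this sketch, yet the points near the peak still line up on a negatively sloped line.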

Peak Oil and TOD commenter Khebab also had an interesting point a while ago concerning residual analysis from Hubbert Linearization. As I got reamed for not doing this recently, it pays to read what Khebab said (which I agree with, if you substitute U for Q in the following derivation):
I'm skeptical about the use of this method to present production data because the relative error doesn't seem to be distributed uniformly. The relative error in the log domain of the vertical ordinates according to the logistic model is the following:
`D(ln(aP/Q)) = Dk/k - (DQ/Q) x Q / (1 - Q)`

where D stands for the Greek symbol Delta, and `Dk/k` and `DQ/Q` are the relative errors on k and Q, which can be presumed constant. The error behaves as follows:

• Production start Q -> 0:
`D(ln(aP/Q)) = Dk/k`

• Production Q -> 1 (total URR has been extracted):
`D(ln(aP/Q)) = -infinity`

Because we are in log domain, `D(ln(aP/Q)) = -infinity` means that deviation around the asymptotic line will tend toward zero!

That's why we observe these wild deviations around the line when production is starting, whereas it seems to converge nicely when Q tends toward 1. This behavior can be misleading for an observer because it seems to reinforce the idea that there is some inexorable mechanism at work pushing the production data around the line.
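The error expression in Khebab's derivation above can be sanity-checked with a few lines of Python. This sketch assumes the logistic form with `Q` normalized to the URR, so that `P/Q = k (1 - Q)`, and treats `a` as a constant that drops out of the differential. The values of `K`, `REL_K`, and `REL_Q` are arbitrary placeholders. It compares the first-order formula against the actual change in `ln(P/Q)` when `k` and `Q` are nudged by small relative errors.

```python
import math

def log_ordinate(k, q):
    """ln(P/Q) for the logistic model with Q normalized to URR: P/Q = k (1 - Q)."""
    return math.log(k * (1.0 - q))

def predicted_error(q, rel_k, rel_q):
    """First-order error from the quoted formula: Dk/k - (DQ/Q) * Q / (1 - Q)."""
    return rel_k - rel_q * q / (1.0 - q)

def measured_error(k, q, rel_k, rel_q):
    """Actual change in ln(P/Q) after perturbing k and Q by the relative errors."""
    return log_ordinate(k * (1.0 + rel_k), q * (1.0 + rel_q)) - log_ordinate(k, q)

# Arbitrary illustrative values; the relative errors are held constant.
K = 0.06
REL_K, REL_Q = 1e-4, 1e-4

rows = []
for q in (0.01, 0.25, 0.5, 0.75, 0.9, 0.99):
    rows.append((q, predicted_error(q, REL_K, REL_Q), measured_error(K, q, REL_K, REL_Q)))

for q, predicted, measured in rows:
    print(f"Q={q:.2f} predicted={predicted:+.3e} measured={measured:+.3e}")
```

The predicted value starts near `Dk/k` for small `Q` and grows increasingly negative as `Q` approaches 1, matching the two limits listed in the derivation.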
I checked the math on this, and it really gets you thinking about what data visualization expert Tufte says about graphing data in a biased fashion. That convergence on a continuously shrinking error acts like a laser beam and gives people the impression of an excellent fit that may have dubious value at best. Which explains my reluctance to do error analysis when the competition has issues of their own.