
Tuesday, January 12, 2010

Odds and Uncertainty

In writings and correspondence, the scientist E.T. Jaynes took a feisty approach to defending his ideas and challenging the status quo. Best known for relating entropy and probability to many areas of science and information technology, Jaynes in particular took on the proponents of the classical statistics school, known as the "frequentists". Although he did not necessarily disparage their work, he could never understand why the classical statisticians had such difficulty embracing alternate ideas, such as those coming from the Bayesian perspective. The "probabilistic" school (which included Jaynes) continued to make great practical strides in solving many thorny physics problems, yet the frequentists resisted the idea that Bayesian approaches could effectively subsume their doctrine. Jaynes showed in fact that ideas from probability could encompass some classical statistics ideas, going so far as to provocatively label probability theory as The Logic of Science.
"Probability Theory is nothing but common sense reduced to calculation" -- Pierre-Simon Laplace
Jaynes described how the mathematician Laplace worked out many of the fundamental probability ideas a couple of hundred years earlier (Jaynes worked in the 20th century, Laplace around the turn of the 19th), yet became marginalized by a few petty arguments. One of the infamous arguments Laplace offered, the Sunrise problem, has supplied ammunition for opponents of Bayesian ideas over the years. In this example, Laplace essentially placed into quantitative terms the probability that the sun would rise tomorrow based on the count of how many times it had risen in the past. We can categorize this approach as Laplace's precursor of Bayes' rule, originally known as the rule of succession. In current terms we would consider this a straightforward Bayesian (or Bayes-Laplace) update, a commonplace approach among empirical scientists and engineers who want to discern or predict trends. Yet legions of mathematicians ridiculed Laplace for years, since his rule did not yield much certainty that the sun would indeed rise tomorrow if we input numbers naively. Instead of resulting in a probability of unity (i.e. absolute certainty), Laplace's rule could give numbers such as 0.99 or 0.999, depending on the number of preceding days included in the prior observations. Others scoffed at this notion because it certainly did not follow any scientific principle, yet Laplace had also attached a firm warning to use strong scientific evidence when appropriate. In many of his writings, Jaynes defended Laplace by pointing out this caveat, and decried the fact that no one heeded Laplace's advice. For many years hence, science missed out on some very important ideas relating to representing uncertainty in data.
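As a quick numerical sketch (my own illustration, not from the original post), the rule of succession gives P = (s+1)/(n+2) after s successes in n trials; plugging in consecutive sunrises naively shows why the result creeps toward certainty but never reaches it:

```python
def rule_of_succession(successes, trials):
    """Laplace's rule of succession: the posterior probability that the
    next trial succeeds, after `successes` successes in `trials` trials."""
    return (successes + 1) / (trials + 2)

# Naive sunrise inputs: a year, a decade, a millennium of sunrises.
# The probability approaches 1 but never reaches it.
for days in (365, 3650, 365_000):
    print(days, rule_of_succession(days, days))
```

Note that with no observations at all the rule returns 1/2, the maximally non-committal starting point.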

My own views align with the way that Jaynes thinks, especially in how to apply probability arguments. We inhabit a world rife with uncertainty and disorder. In some cases, such as in the world of statistical mechanics, one finds that predictable behavior can arise out of a largely disordered state space; Jaynes essentially reinterpreted statistical mechanics as an inferencing argument, basing it on incomplete information on the amount of order within a system.

In the oil world we know many of the cause-effect relationships (people need oil so oil gets depleted, etc), but we don't understand the evolution quantitatively, nor how much order versus randomness plays into the behavior. These missing pieces of data, together with the lack of a good quantitative understanding, motivate my attempts at arriving at some fundamental depletion models.
"The Result of any Transformation imposed on the Experimental Data shall incorporate and be Consistent with all Relevant Data and be maximally non-committal with regard to Unavailable Data"
-- The First Principle of Data Reduction (due to Ables in 1974).
Jaynes spent much time understanding how to apply the Maximum Entropy Principle to various problems. I happen to use the MaxEnt principle with regard to oil because I personally don't have access to all the oil production and discovery numbers apparently available. The approach works quite effectively for me in other application areas as well, and I know I can rely on it in many future situations.
"Science is fully justified in finding some relation between these fields only after the equality of mathematical methods has been reduced to an equality of the real nature of the concepts." -- A. Einstein.
These ideas have such a fundamental basis that you really need to stretch to discredit them, essentially making MaxEnt very much an Occam's razor argument. Only a non-obvious new physics explanation could ever displace the obvious choice.
"Any success that the theory has, makes it useful in an engineering sense, as an instrument for prediction. But any failures which we might find would be far more valuable to us, because they would disclose new laws of physics. You can't lose either way." -- E.T. Jaynes
What I find surprising is that other oil depletion analysts haven't caught on to this approach yet. Mobil Oil actually published one of the early classic Jaynes texts, based on a symposium held in their research labs.

Similarly, academic geophysicists such as J.P. Burg have used Jaynes's ideas to great effect. Burg essentially derived the approach known as Maximum Entropy Spectral Analysis. Not limited to geophysics, this technique for uncovering a signal buried in noise has become quite widely applied.

This leaves us with the curious situation that the petroleum and geology fields showed a huge early interest in MaxEnt but never carried it forward. Jaynes often pointed out that some of the applications are so straightforward that a robot, given only the fundamental probability rules, could figure out the solution to many of these problems -- presumably oil depletion included.
"We're not going to ask the theory to predict everything a system could do. We're going to ask, is it possible that this theory might predict experimentally reproducible phenomena" -- E.T. Jaynes
Jaynes has said that thinking about maximizing entropy parallels the idea that you place your bets on the situation that can happen in the greatest number of ways. Then because enough events and situations occur over the course of time, we end up with something that closely emulates what we observe.
"Entropy is the amount of uncertainty in a probability distribution" -- E.T. Jaynes
Yet we don't get something for nothing, so we still have to guess at the underlying probability distribution. This sounds hard to do, but the basic rules for maximizing entropy assume only the known constraints, which include things like the mean or the data interval.
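As a hedged illustration of that point (my own, not from the post): with only a known mean as a constraint, the maximum entropy distribution over positive values is the exponential. A quick check compares its differential entropy, 1 + ln(mean), against a uniform distribution sharing the same mean:

```python
import math

def exponential_entropy(mean):
    # Differential entropy of an exponential distribution with this mean
    return 1 + math.log(mean)

def uniform_entropy(mean):
    # A uniform distribution on [0, 2*mean] shares the same mean;
    # its differential entropy is ln(2*mean)
    return math.log(2 * mean)

m = 1.0
print(exponential_entropy(m), uniform_entropy(m))
# The exponential carries more entropy: it commits to nothing
# beyond the mean constraint itself.
```

The gap, 1 - ln 2, is independent of the mean, so the exponential wins for any mean value.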
"No matter how profound your mathematics is, if you hope to come out with a probability distribution, then some place you have to put in a probability distribution" -- E.T. Jaynes
Given all that as motivation, we can look at oil reservoir field sizes and see what other ideas shake out. Based on a MaxEnt of the aggregation of reservoirs over time, I previously came up with the following cumulative probability distribution for field sizes:
P(Size) = 1/(1+C/Size)
In terms of odds, we can rearrange this formulation using the definition of odds for, Odds = P/(1-P), or odds against, Odds = (1-P)/P. Then the odds of finding a reservoir larger than a certain size, assuming we randomly pick from the sample population, become
Odds(Size) = C/Size
This simple result gives us some great insight. It essentially tells us that the greater the desired size of the reservoir, the progressively smaller the odds that we would come across at least that size. For the USA, the value of C comes out to less than 1 million barrels (MB), so the odds of finding a field of at least 10,000 MB come out to 1:10,000. This assumes that we randomly draw from a sample of newly discovered fields.

On the other hand, if we want the odds of drawing from the sample and expecting at least a 1 MB field, we put in the formula and get 1:1, or basically even odds. So if we want to somehow maintain our current rate of domestic production by placing safe bets, we have to find an awful lot of small reservoirs.
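The two odds calculations above can be sketched directly; here I assume C = 1 MB (the text only says the US value comes out to less than 1 million barrels) and the function names are mine:

```python
def cumulative_prob(size, c=1.0):
    """P(Size) = 1/(1 + C/Size), sizes in millions of barrels (MB)."""
    return 1.0 / (1.0 + c / size)

def odds_at_least(size, c=1.0):
    """Odds of drawing a field of at least `size`: (1-P)/P = C/Size."""
    p = cumulative_prob(size, c)
    return (1.0 - p) / p

print(odds_at_least(10_000))  # about 1e-4, i.e. 1:10,000
print(odds_at_least(1))       # 1.0, even odds
```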

We could also place our bets on the long payoff, but we need to realize that the probability size distribution asymptotically flattens for large sizes (see this post) and the odds factor blows up. The cumulative probability becomes
P(Size) = Size*(L+C)/((Size+C)*L)
while the odds become
Odds(Size) = (Size+C)*L/(Size*(L+C)) - 1
This gives similar odds for a small reservoir, still close to 1:1, but the odds of getting a large reservoir no longer scale. For example, if we use a maximum size L of 20,000 MB, then the odds of finding at least 10,000 MB come out to one half the odds without the maximum size. And the odds of getting anything bigger than 20,000 MB become essentially 1 in infinity.
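A quick numeric check of that factor-of-two claim, assuming C = 1 MB and the maximum size L = 20,000 MB used above (this is my own sketch, not code from the post):

```python
def odds_capped(size, c=1.0, L=20_000.0):
    """Odds under the capped distribution:
    P(Size) = Size*(L+C)/((Size+C)*L), Odds = 1/P - 1."""
    p = size * (L + c) / ((size + c) * L)
    return 1.0 / p - 1.0

def odds_uncapped(size, c=1.0):
    """Odds without a maximum size: C/Size."""
    return c / size

# At 10,000 MB the capped odds come out to about half the uncapped odds,
# and at the 20,000 MB cap the odds of finding anything bigger vanish.
print(odds_capped(10_000) / odds_uncapped(10_000))
print(odds_capped(20_000))
```

Algebraically the capped odds reduce to C*(L-Size)/(Size*(L+C)), which hits zero exactly at Size = L.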

This all comes about from assuming a maximum entropy distribution on the accumulation of the reservoirs and then applying a constraint on the time that these reservoirs accumulate. As Jaynes said, we can do quite a bit with incomplete information.

The same arguments apply to the dispersive discovery model, which places fixed limits on cumulative production based on similar incomplete information. Why the guys at Mobil Oil never figured this out, we will never know. Who knows .. they could have known about this approach at some point and never wanted to disseminate it to the masses, or they never cared and focused strictly on the bottom line.

King Hubbert clearly never applied any of Jaynes' principles, except perhaps at some deep intuitive level. But as Jaynes himself might have concluded, that would have been fine, since the intent of probability theory has always been to place quantitative terms on human insight. So Hubbert gave us some of the insight, and the probability-based models, such as dispersive discovery and the oil shock model, provide the mathematical foundation.

We may fear change, but we should not fear uncertainty .. at least when it comes to math.

Perhaps you can consider this argument fairly reasoned and a contribution to a scientific understanding of the oil depletion phenomenon. I have no funding from Mobil, or anywhere else for that matter, but I hold on to the hope that the models can make a difference. What I find disconcerting is that another rag-tag group of analysts thinks it can accomplish a similar effect in denying climate change. Christopher Essex had this to say to the WSJ:
Science is alive and well in the individual scientists who are not caught up in gaming the system for bigger grants. I call it small science. Many of them are doing very unfashionable things, and are happy to get no recognition for it.

That is where you can find the real scientists. That is where the future will be.
Good sentiment, but too bad that the clown Essex wrote this -- the mathematician who thought that temperatures cannot be averaged. I hammered his book when it came out almost 6 years ago, and I find it embarrassing that I share the same attitude as he does with respect to "small science". The difference is in the agenda; I have no agenda, but McIntyre, Essex, and the other deniers have one in spades.



"Like strange bulldogs sniffing each other's butts, you could sense wariness from both sides"