[[ Check out my Wordpress blog Context/Earth for environmental and energy topics tied together in a semantic web framework ]]

## Monday, August 18, 2008

### Pipes and the Oil Shock Model

No one outside of engineering seems to understand the usefulness of mathematical convolution. To me, it seems a natural operation, as obvious as other more well known statistical operators such as auto-correlation. Yet, you can't find convolution listed in the @function list of a spreadsheet program such as M\$ Excel. I have an interest in this because convolution remains at the heart of the Oil Shock Model.

The shock model essentially expresses an oil production curve as a series of temporal convolution operators acting on an initial oil discovery stimulus profile. Each convolution corresponds to a phase of the lifetime of an average discovery, expressed probabilistically to account for the spread in potential outcomes. So that the original discovery profile gets convolved initially by a damped exponential representing the range in fallow times. This output gets followed by a convolution of another profile representing the range in construction times. In turn, the output of this gets convolved into another range of maturation times and then into the final extraction rate profile.

I had previously posted a full program that internally generated the entire set of convolutions, but I finally relented and decided to simplify the concept and use the idea of data flow and short scripts to perhaps make the concept more accessible (and potentially more flexible). Via UNIX or a Windows command shell, one can use the operating system ideas of pipes and the standard input/output streams to generate a very simple instantiation of the Oil Shock Model in terms of a short script and data files.

The pipes in the title of this post offer both an abstraction as to what occurs in the physical world as well as a useful mathematical abstraction to enable us to solve the shock model.

A typical invocation would look something like this:
`cat discover.dat | conv fallow.dat | conv cons.dat | conv mature.dat | conv extract.dat`
Each one of the phases uses the standard output of the previous phase as a standard input via the operating system pipe command signified by the vertical bar "|". The term `conv` refers to a shell script or executable that takes the data piped into it and convolves it with data from a file given by the script's command line argument. The initial `cat` call reads from the discovery data file and pushes it into the first pipe.

In terms of a popular scripting language such as Ruby, the `conv` script looks like this:
`# conf.rb: # convolution of stdio input array against array from filedef conv(a, b)  length = a.length + b.length  for i in 0..length do    sum = 0.0    i.downto(0) do |j|      sum += a[i-j].to_f * b[j].to_f    end    puts sum  endendconv(STDIN.readlines, IO.readlines(ARGV))`
The last line essentially calls the defined convolution function with two arrays generated automatically by using Ruby's `readlines` file parsing call. So that for each line in the file, representing a year's worth of data, an array is generated both for the standard input data stream, as well as the command line file's data. In other words, "a=the input data" and "b=the convolution data".

Operationally, to call this file using the Ruby interpreter, one has to invoke it with something akin to `"ruby conv.rb file.dat"`. And the data in each of the profiles has to contain enough entries to cover the range of years that you desire to generate a production profile for. The convolution function takes care of the ranges automatically (i.e. the full convolution generates a range that covers the sum of the individual time ranges).

A typical data file for something with a 10 year damped normalized exponential profile would look like:
0.1
0.09048
0.08187
0.07408
....
The ... would go on for N number of lines corresponding to N years. Of course, the data files themselves we can easily generate through other tiny scripts. The UNIX shell has a command called the back-tick "`" which when invoked within the command-line can generate a script call in-place. This means that we have many convenient ways, including providing a lambda function to Ruby itself, to generate the data profiles needed by the convolution operator.

In general, I find nothing really complicated about the convolution operation and find it really a shame that we don't have this functionality built into the set of standard spreadsheet operators. So even though this alternative pipe approach looks incredibly simple in principle, enough people stay away from the command line that it will never achieve the popularity of a spreadsheet formulation. Something like Matlab of course has the convolution operator built-in but it costs big coin and caters to the engineering crowd (of course). Alas, for the moment we will have to satisfy ourselves with pipes.

Update:
I should have added the Shock function to the list of scripts. It essentially acts the same as a convolution, but since it relies on perturbations to a Markov (memoryless) process, we can only add it to the end of the sequence of convolutions. The file it works with contains a list of fractional extraction rates (acting proportionally to the current reserve) matched to the years since the first discovery occurred. For a stationary process, these rates stay relatively constant from year-to-year, but due to the possibility of global events, these rates can suddenly change, leading to sudden blips and dips in yearly production numbers.
The Shock script:
`# shock.rb: # Markov extraction of stdio data using perturbed rates from filedef shock(a, b)  length = b.length  temp = 0.0  for i in 0..length do    output = (a[i].to_f + temp) * b[i].to_f    temp = (a[i].to_f + temp) * (1.0 - b[i].to_f)    puts output  endendshock(STDIN.readlines, IO.readlines(ARGV))`

The extraction rate file would look like this:
0.1
0.1
0.1
0.12
0.1
....
The fourth entry shows a 20% upward bump in the extraction rate. The complete shock model invocation would thus look like this:
`cat discover.dat | conv fallow.dat | conv cons.dat | conv mature.dat | shock rate.dat`

Update 2:
The following script takes the standard input and applies a constant extraction rate to the accumulated reserve data. Notice how the convolution simplifies given a Markov approximation.
The un-Shocked script:
`# Markov of input array against arg valuedef exp(a, b)  length = a.length   temp = 0.0  for i in 0..length do    output = (a[i].to_f + temp) * b    temp = (a[i].to_f + temp) * (1.0 - b)    puts output  endendexp(STDIN.readlines, ARGV.to_f)`
This becomes a "shock"-less model and gets invoked as `"ruby exp.rb 0.1"`, where the argument becomes a single floating-point value instead of a file. The extraction extends for as long as the input data sustains itself, which means that you need to extrapolate the input data if you want to better extrapolate into the future. I suggest this as a useful technique for every one of the scripts.

All of these scripts generate a granularity of only one year so don't expect great results if the rates have time constants that get too close to one year. I would suggest that switching over to a smaller granularity than a year in this case; you just have to remember that the resultant output data will have this same granularity.

Professor Joshua Stults said...

I really like your simple model, I think the speed / readability could be improved if you went to FFT-based convolutions. Using pipes to chain the convolutions seems like a lot of overhead, is there some reason you aren't doing it all in the same process?

1:35 PM
Professor @whut said...

I agree that FFT would help as you stage more and more of these together, because it becomes a simple multiply between stages.

The key to the FFT approach is that you have to really think carefully to get the decimation in time correct up front.

9:31 AM
Professor Joshua Stults said...

Yeah, I've thought a bit about how to do a time-varying extraction rate with the FFT convolution approach, but haven't come up with a good / simple method yet.

9:57 AM
Professor @whut said...

Right, it is no longer a stationary process, with all that entails.

At least you can do the first N-1 convolutions with an FFT, and then do the last as a perturbed extraction rate.

9:35 PM
Professor @whut said...

BTW, Various Consequences is a great blog.

9:37 PM

"Like strange bulldogs sniffing each other's butts, you could sense wariness from both sides"