
## Saturday, April 24, 2010

### Extracting the Learning Curve in Labor Productivity Statistics

The term ergodicity refers to the uniformity and fairness in occupation of a system of probability-based states. In the way I think about it, an ergodic Tic-Tac-Toe distribution would have gained enough statistics that the measured occupancy of squares at a game's end, averaged over many games, would equal some theoretically predicted occupancy probability.

One can get to this state either by capturing statistics over a long period of time or having some process that doesn't have statistical runs that break the stationarity principle. I sometimes confuse the term stationary with ergodic. The first has more to do with suggesting that a particular snapshot in time does not differ from another snapshot at some later time (i.e. independent events). The latter makes certain that the process has enough variety in its trajectories to visit all the possible states.

One of the first mathematical experiments I remember working in school had to do with random number generation. At the time, hand-held calculators still held some excitement and to take advantage of this, the instructor gave the students an assignment to come up with their own uniform random number generator by creating some complicated algorithm on the calculator. The student could then easily test his results.

As I recall, I felt a certain amount of pride in the clever way that I could get results that appeared random. I don't think I understood what pseudo-random meant at the time. In retrospect, I probably had enough knowledge to realize the power of the inverse trig functions and how they could generate numbers between zero and one, and of the idea of truncating to the decimal part of the number (to get a value between 0 and 1). I am sure that my classroom random number generator would not pass any quality tests, but it worked well enough for learning purposes.

I mention this because I also think it first gave me an idea of how easily one can enter into a very disordered state, one of almost "uniform randomness". It really does not take much in terms of a combination of linear or non-linear calculations before one can obtain a well-travelled set of trajectories and thus an ergodic distribution that also seems to meet the maximum entropy principle.
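As a rough illustration of how little it takes, here is a minimal sketch of a calculator-style generator. It is not the generator I built in school (that one leaned on inverse trig functions); the map `x -> frac((x + pi)**5)`, the seed, and the helper names `calculator_prng` and `occupancy` are all assumptions chosen for the example. It keeps only the fractional part of a non-linear expression and then counts how often each tenth of the unit interval gets visited.

```python
import math

def calculator_prng(seed, n):
    """Toy generator (illustrative only): iterate x -> frac((x + pi)**5)."""
    x = seed
    values = []
    for _ in range(n):
        x = (x + math.pi) ** 5
        x -= math.floor(x)  # keep only the fractional part, so 0 <= x < 1
        values.append(x)
    return values

def occupancy(values, bins=10):
    """Count how many values land in each equal-width bin of [0, 1)."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

# The seed 0.1234 is arbitrary; any starting value in [0, 1) works.
samples = calculator_prng(seed=0.1234, n=10_000)
print(occupancy(samples))
```

If the bin counts come out roughly equal, the trajectory has visited the states fairly evenly, which is the informal sense of ergodic used here. A generator like this would still fail serious randomness tests.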

The post on Japanese labor productivity statistics rests on this same assumption. Enough variety exists in the Japanese labor pool for the statistics to reach an ergodic level. And enough complexity exists in the ways that the laborers operate that they will visit all the labor productivity states possible considering the constraints. That partly explains why I feel confident that some deep fundamental stochastic behavior explains the results.

The labor productivity model that I used to generate the distribution involved the application of a non-linear Lanchester model for competition. The non-linear nature of this model at least contributes to the potential for the states to reach an ergodic limit, much like a complex expression invoked on a calculator allows one to generate a pseudo-random distribution quite easily.

In the parlance of attractors, these equations naturally translate into a scenario where you can imagine all sorts of possible trajectories occurring.

Unfortunately, this does not help too much with intuition. Yes, one can execute a bunch of random calculations and note how easily various states accumulate, but the non-linear math still gets in the way.

For that reason, I will create an alternate model of labor productivity that invokes similar math but relies on the more intuitive concept of a "learning curve". In practice, a learning curve can exist where a worker can pick up much of the rudimentary skills very quickly, yet to get that last level of productivity will often take much more time [1].

Call the labor productivity C(t) and note that it has some minimum level (Min) and maximum level (Max), based respectively on the basic minimum requirements of the company and on the maximum technically achievable productivity.

Then over the course of time, a new hire may see productivity rise according to this simple relationship:
dC(t) = k/(C(t) + v) * dt
This mathematical relationship says that the increment of productivity (dC) per unit of time (dt) is inversely proportional to the productivity accumulated so far. In this case productivity equates to skill level. Thus, the worker can learn the early skills very easily, where the accumulated skills have not risen to a high level. (The parameter v prevents the learning curve rate from going to infinity initially). However, as time advances and these skills accumulate, the growth of new skills starts to slow and it reaches something of an asymptotic limit. We can consider this a law of diminishing returns as the weight of the accumulated skills starts to weigh down the worker.

Rearranging the equation, we get:
(C + v) dC = k dt
We can integrate this to get the result:
½C² + vC = kt + constant
This results in the same constraint relationship that I used in the previous post in determining P(C) according to the dispersion formulation for C. The equation, if given constrained limits (both Max and Min values), has a solution according to the basic quadratic formula, which I used for the Monte Carlo simulation. Otherwise we simply use the ergodic view that all the various values of t will get visited over time, and all the possible constant values will show up according to maximum entropy.
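To make the quadratic solution concrete, here is a minimal sketch that solves ½C² + vC = kt + constant for C and clamps the result. It assumes the worker starts at the minimum level, C(0) = Min, which fixes the constant; the function name `learning_curve` and the parameter values are illustrative choices, not the ones from the Monte Carlo simulation.

```python
import math

def learning_curve(t, k, v, c_min, c_max):
    """Productivity C(t) for dC/dt = k / (C + v), starting at c_min, clamped at c_max."""
    # Integrated form: 0.5*C**2 + v*C = k*t + const, with const set by C(0) = c_min.
    const = 0.5 * c_min ** 2 + v * c_min
    # Positive root of the quadratic 0.5*C**2 + v*C - (k*t + const) = 0.
    c = -v + math.sqrt(v ** 2 + 2.0 * (k * t + const))
    # Productivity cannot exceed the maximum achievable level.
    return min(c, c_max)

# Illustrative parameters only: fast early gains, then slow growth until the clamp.
for t in (0, 1, 4, 16, 64):
    print(t, round(learning_curve(t, k=1.0, v=0.5, c_min=1.0, c_max=8.0), 3))
```

Between the two limits the curve grows like the square root of time, so each doubling of skill takes roughly four times as long as the last.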

The following figure shows a single instance of a labor productivity learning curve. The curve has a minimum level and a maximum level at which productivity clamps. Imagine a set of many of these curves, all with different quadratic slopes and maximum constraints; that set turns into the statistical mechanics of the labor productivity distribution function P(C) that we see aggregated over all possible learning times.

This leads to the density function and the excellent fit to the data when entropic dispersion gets applied.

This essentially gives an alternate explanation, as a learning curve problem, for the excellent fit we get for labor productivity. All workers go through a learning curve that shows a minimum proficiency and a maximum productivity that clamps the level. In between, we see the quadratic solution growth, which shows up as the inverse power law of 2 in the labor productivity distribution function.
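The idea of aggregating many clamped learning curves can be sketched as a small Monte Carlo run. This is a toy under stated assumptions, not the simulation from the previous post: I assume exponentially distributed (maximum entropy) learning rates and ceilings, a uniform spread of time on the job, and hypothetical names and parameter values throughout. It makes no claim to reproduce the fitted distribution.

```python
import math
import random

def sample_productivities(n, c_min=1.0, v=0.5, mean_k=1.0, mean_cap=10.0,
                          t_max=50.0, seed=1):
    """Draw n productivity values from an ensemble of clamped learning curves."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        # Assumed priors: exponential spread in learning rate and in the ceiling.
        k = rng.expovariate(1.0 / mean_k)
        c_max = c_min + rng.expovariate(1.0 / mean_cap)
        # Assumed: the worker is observed at a uniformly random time on the job.
        t = rng.uniform(0.0, t_max)
        # Solution of 0.5*C**2 + v*C = k*t + const with C(0) = c_min.
        c = -v + math.sqrt((c_min + v) ** 2 + 2.0 * k * t)
        out.append(min(c, c_max))
    return out

def survival(values, threshold):
    """Fraction of sampled workers whose productivity exceeds the threshold."""
    return sum(1 for x in values if x > threshold) / len(values)

values = sample_productivities(20_000)
for c in (2, 4, 8, 16):
    print(c, survival(values, c))
```

Printing the survival fraction at doubling thresholds gives a quick look at how fast the tail falls off without needing a plotting library.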

To make this consistent with the Lanchester labor model, note that the warfare between firms competing for the same resources provides further elements of disorder. Workers switching firms can cause labor productivity to clamp as progress stalls. By the same token, the cross-pollination of worker skills between firms, together with compounding growth, adds to the dispersion in the growth rates.

[1] Also see Fick's diffusion equation or a random walk, which shows the same quasi-asymptotic properties.