Wednesday, September 23, 2009

Random Mutation

In their paper Conservation of Information in Search: Measuring the Cost of Success William Dembski and Robert Marks present an algorithm which resembles Dawkins's weasel (the algorithm I talked about before - at length :-)
It isn't the algorithm Partioned Search introduced on p. 1055, it's the algorithm Random Mutation on p. 1056. The differences
  1. the alphabet consists of two letters, not 27.
  2. the target length is 100, not 28
  3. the size of a generation is quite low: it's two
  4. the mutation probability per letter is very small: µ = 0.00005
Many programmers have implemented weasel-like programs, and the runs I've seen usually use a µ between 0.04 and 0.05, while the generation size is greater than 20.

So, why do W. Dembski and R. Marks use their values? Well, in the appendix of their paper, they calculated an approximation for the number of correct letters depending on the number of generations. This approximation discards terms quadratic in µ, so to use it, not only has µ << 1, but the size of a generation has to be small, too.

But should they use these values? I don't think so: As usual, I modeled their algorithm as a Markov chain and could calculate some exact values. The expected number of generations to get to the target for their choice of parameters is 55,500. (see pic 1)

Some fooling around with their program should have shown that this number can be reduced dramatically by using an more appropriate value for µ. The optimum lies near to µ ~ 0.0005, which takes only 10,600 generations.


And though their approximation doesn't work as fantastically as for the parameter ten times smaller, the numbers are not more than 2.5% off. (see pic 2)

No comments:

Post a Comment