Wednesday, August 26, 2009

On Oracles and Weasels

It's once again about Richard Dawkins's weasel algorithm. In his book The Blind Watchmaker, Dawkins describes his algorithm in the following way:

It again begins by choosing a random sequence of 28 letters, just as before:

WDLTMNLT DTJBKWIRZREZLMQCO P

It now 'breeds from' this random phrase. It duplicates it repeatedly, but with a certain chance of random error - 'mutation' - in the copying. The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL. [...] the procedure is repeated.


In my previous posts, I explained how this can be transformed into a program:

  1. choose a random string
  2. copy the string n times with mutation. NOTE: at this step you don't know which letters are correct in place, so no letter is safe from being mutated!
  3. choose the best-fitting string. NOTE: best-fitting seems to be generally understood to be the string with the most correct letters; the fitness is expressed by a number between 0 and 28
  4. stop if the number of correct letters is 28, otherwise
  5. goto 2
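
These five steps translate almost line by line into a short program. Here is a minimal sketch in Python (the population size n = 100 and the mutation rate p = 0.05 are illustrative choices of mine, not parameters given by Dawkins):

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "   # 26 letters plus the space

def fitness(phrase):
    # number of letters which are correct in place (0 .. 28)
    return sum(a == b for a, b in zip(phrase, TARGET))

def mutate(phrase, p):
    # every letter may be mutated - no letter is safe (see the NOTE of step 2)
    return "".join(random.choice(ALPHABET) if random.random() < p else c
                   for c in phrase)

def weasel(n=100, p=0.05):
    phrase = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))   # step 1
    generations = 0
    while fitness(phrase) < len(TARGET):                                    # step 4
        copies = [mutate(phrase, p) for _ in range(n)]                      # step 2
        phrase = max(copies, key=fitness)                                   # step 3
        generations += 1                                                    # step 5: goto 2
    return generations

print(weasel())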
Some people (including W. Dembski, I think) are baffled by step 3: seemingly, the program has to know the target string in order to pick the best copy, and so it can extract information about it. Therefore, I restate this algorithm slightly:

  1. choose a random string
  2. copy the string n times with mutation
  3. ask the oracle which string it likes best
  4. stop if the number of correct letters is 28, otherwise
  5. goto 2

Evaluating a fitness function for an algorithm is like questioning an oracle. Even Robert Marks's and William Dembski's web site says so - though they don't think it through.

So, how does the oracle work in Dawkins's case? You present it with a list of strings and it just tells you which string is best ("The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL."). You don't have to know how the oracle deems one string better than another: it could prefer the string with the most letters correctly in place, or the one which shares the longest substring with its (hidden) target, or perhaps it looks at how many letters are right from the beginning of the phrase, preferring METHINKS IT IS FROGLIKE XXX to BETHINKS IT IS LIKE A WEASEL, or it could prefer correct vowels to correct consonants, or it could use a combination of these on Fridays, and on Saturdays something else...
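
To make this black-box character explicit, here is a small sketch (the two criteria are of course just stand-ins I made up - the point is only that the caller hands over a population and gets back one string, nothing else):

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"

def letters_in_place(s):
    # one possible hidden criterion: number of letters correct in place
    return sum(a == b for a, b in zip(s, TARGET))

def longest_shared_substring(s):
    # another possible hidden criterion: longest substring shared with the target
    return max(j - i
               for i in range(len(s))
               for j in range(i, len(s) + 1)
               if s[i:j] in TARGET)

def oracle(strings):
    # the hidden criterion may differ from call to call - the caller never learns it
    criterion = random.choice([letters_in_place, longest_shared_substring])
    return max(strings, key=criterion)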

Could Dembski's algorithm work with such an oracle? No. Here is his algorithm:
  1. choose a random string
  2. ask the oracle which letters are in place
  3. if all letters are in place: STOP. Otherwise:
  4. mutate the letters which are not in place
  5. goto 2
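
A sketch of this procedure, with a hypothetical oracle that reports the positions of the correct letters - which is exactly the extra information Dembski's version needs:

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def position_oracle(phrase):
    # this oracle tells you *which* positions are correct - much more than Dawkins's oracle does
    return {i for i, (a, b) in enumerate(zip(phrase, TARGET)) if a == b}

def partitioned_search():
    phrase = [random.choice(ALPHABET) for _ in range(len(TARGET))]    # step 1
    queries = 1
    correct = position_oracle(phrase)                                  # step 2
    while len(correct) < len(TARGET):                                  # step 3
        for i in range(len(phrase)):                                   # step 4
            if i not in correct:
                phrase[i] = random.choice(ALPHABET)
        queries += 1
        correct = position_oracle(phrase)                              # step 5: goto 2
    return queries

print(partitioned_search())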


What Dembski exploits is knowledge of the oracle's inner workings: knowing how it works even allows for a deterministic approach - the Hangman game - which terminates in a few steps...

White Screen & Denyse O'Leary

My edits at Uncommon Descent obviously aren't welcome anymore. What a pity: now there are two active threads over there discussing Dawkins's and Dembski's weasel algorithms (as described earlier in A Story of Two Weasels, The Latching Weasel, and Dawkins's weasel: it isn't that bad).

Denyse O'Leary asked:

Make the code for the program public. Perhaps Richard Dawkins himself or his friends at RichardDawkins.net can finally provide this code (apparently a program written in BASIC).

I tried to enlighten her. After all, the program is now over 20 years old, and may be rotting in a heap of 5 1/4" floppies...

Dawkins describes his algorithm in the following way:

It again begins by choosing a random sequence of 28 letters, just as before:

WDLTMNLT DTJBKWIRZREZLMQCO P

It now 'breeds from' this random phrase. It duplicates it repeatedly, but with a certain chance of random error - 'mutation' - in the copying. The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL. [...] the procedure is repeated.

This is enough to replicate his program:
1. choose a random string
2. copy the string n times with mutation. NOTE: at this step you don't know which letters are correct in place, so no letter is safe from being mutated!
3. choose the best-fitting string. NOTE: best-fitting seems to be generally understood to be the string with the most correct letters; the fitness is expressed by a number between 0 and 28
4. stop if the number of correct letters is 28, otherwise
5. goto 2

The parameters which you can choose are the number of copies n and the probability p that a letter in a string is mutated. You may even choose another procedure of mutation - but keep in mind the NOTE of step 2.

It's really basic to realize steps 1 - 5 in the programming language of your choice...

While the edit above made it through Uncommon Descent's gates, I'm not allowed to post the follow-up:

Just to elaborate: The interesting part is what Atom describes as an oracle, i.e., the application of the fitness function. In Dawkins's description we read:

The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase

An oracle - a black box - which accomplishes this just points to one string when presented with a population of strings. No further information has to be exchanged. It is not necessary to know how the oracle determines the best string: it could point you to the one with the greatest number of correct letters - or perhaps to the one which shares the longest substring with the target phrase.


And that's rather a pity: W. Dembski is a guy who is interested in information, and in the amount of information which is transferred from an environment into the algorithm. He should be interested in the flow of this information: how is this information exchanged?
The answer: via the conversation with the oracle. Could his partitioned search algorithm work with an oracle as described above? No, as it is not enough to name the best string; the positions of the correct letters have to be named, too.
Does Dawkins's algorithm work with such an oracle? Obviously...

"Your comment is awaiting moderation...."

Perhaps I should be grateful that I'm not blocked at Uncommon Descent. Update: Perhaps I should have been grateful that I was not blocked at Uncommon Descent.
But it's rather annoying to wait for a couple of hours to get your comment through moderation while a discussion is ongoing: my edit contributed at 2:52 am today (Aug 26, 2009) has still not appeared over six hours later, though there were fifteen other comments in the meantime. What is so hard about moderating the following?
I took the string
SCITAMROFN*IYRANOITULOVE*SAM
and calculated a next generation using Dawkins's algorithm with populations of 10, 50 and 100 - and mutation rates of .04, .05 and .1. The tenth string in the list is the second generation given in the paper of Marks and Dembski. The differences from the first generation are in bold face:

1. SCITAMROFN*IYRANOIEULOVE*SAM
2. SCITAMROFN*IYRANOITULOGE*SAM
3. ECITAMRI*N*IYZANOITULOVE*SAM
4. SCITAMROFN*IYRANOITUL*VE*SAM
5. SCITAMROFN*IYRANOITULOVE*SEM
6. SCITAMOOLNOIYRAMOITULOVE*SEM
7. SCITANROFN*IYYANOITULOVE*SAM
8. SCITIMROFN*JYRANOITULOVE*SAM
9. SCITAMROFN*ICRHNOITSLOWE*SAV
10. OOT*DENGISEDESEHT*ERA*NETSIL

Can anyone spot a difference in the design of the strings? Anyone? KF? Anyone?

I'll stay tuned :-)

Update: Now I've two comments waiting in the queue:
232 - DiEb - 08/26/2009 - 9:39 am Your comment is awaiting moderation.
I try to get involved in the discussion, but my last edit (#213) is now in moderation for nearly seven hours…


Update: After eight hours, the comment got through. But I seem to be blocked from now on....

Tuesday, August 25, 2009

The Latching Weasel

I described Dawkins's ''weasel algorithm'' in previous posts. Some claim that in the process the correct letters are ''explicitly or implicitly'' latched. That's not true: the fitness of the chosen string isn't necessarily monotonically non-decreasing - a change for the worse from one generation to the next can happen at least once while running the program.
I calculated the probabilities for two cases: a population of ten and a population of fifty - both times with a mutation rate of 4%.

population of 10: 95.7 %
population of 50: 0.0000026 %

With a small population, the effect will most probably be observed. But even with the bigger population, a programmer may spot the event - if he is lucky enough.

Why did I choose these populations? A population of ten seems to have been used by R. Dawkins in his video of 1987, while a population size of fifty was probably used for his book in 1986.
In the first case, it takes 1306 generations on average (σ=924); in the second case, 139 generations (σ=48). So it is quite probable to observe such a reversal, as seen in the video, too. For the bigger population such an occurrence is at least more likely than winning a lottery...
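
The numbers above were calculated exactly; a rough way to check at least the first one is a small Monte Carlo simulation along the following lines (a sketch - the run count is kept small, and the tiny probability for a population of fifty is of course out of reach for a simulation of this size):

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(s):
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s, p):
    return "".join(random.choice(ALPHABET) if random.random() < p else c for c in s)

def run_has_setback(n, p):
    # True if the fitness of the chosen string drops at least once during one complete run
    phrase = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    setback = False
    while fitness(phrase) < len(TARGET):
        best_child = max((mutate(phrase, p) for _ in range(n)), key=fitness)
        if fitness(best_child) < fitness(phrase):
            setback = True
        phrase = best_child
    return setback

runs = 100
print(sum(run_has_setback(10, 0.04) for _ in range(runs)) / runs)   # should come out near 0.96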

Here is a complete view for various rates of mutation ( 1% - 10%) and populations from 1 to 100:

Monday, August 24, 2009

Dawkins's weasel - it isn't that bad...


Richard Dawkins writes about his weasel algorithm:
Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways.

But it is useful in yet another way: it helps to illustrate the importance of a sensible choice of the rate of mutation!
I calculated the expected number of generations for ten different rates of mutation per letter: 1%, 2%,..., 10%. As the graph - and the table - shows, the algorithm works best with a moderate rate of 4% or 5%.


[Graph: expected number of generations as a function of the rate of mutation, for populations of 25, 50, and 100]

            size of the population
                25       50      100
  1 %          534      274      145
  2 %          335      175       96
  3 %          280      148       83
  4 %          263      139       79
  5 %          266      140       79
  6 %          296      146       82
  7 %          434      159       88
  8 %        1,394      179       96
  9 %       15,313      229      107
 10 %      398,256      429      123
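
The values in the table were computed analytically; a simulation of the following kind only approximates them, but it reproduces the trend (a sketch - the population of 50, the selected rates, and the run count are arbitrary choices of mine):

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(s):
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s, p):
    return "".join(random.choice(ALPHABET) if random.random() < p else c for c in s)

def generations(n, p):
    # number of generations one run of the weasel algorithm needs
    phrase = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    g = 0
    while fitness(phrase) < len(TARGET):
        phrase = max((mutate(phrase, p) for _ in range(n)), key=fitness)
        g += 1
    return g

for rate in (0.01, 0.04, 0.05, 0.10):
    runs = [generations(50, rate) for _ in range(20)]
    print(rate, sum(runs) / len(runs))   # rough estimate of the expected number of generations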


That's not much of a surprise, but it contrasts with the algorithm W. Dembski declared to be the weasel: his version works best with a maximal rate of mutation...

A story of two weasels

In his book ''The Blind Watchmaker'', Richard Dawkins presents a small program which became known as ''Dawkins's Weasel''. He wanted to show the mechanisms of mutation and selection. William Dembski has criticized Dawkins's program, and claims to have done so in his paper Conservation of Information in Search: Measuring the Cost of Success: though he doesn't name Dawkins in this essay, in his blog he states:

In it we critique, for instance, Richard Dawkins METHINKS*IT*IS*LIKE*A*WEASEL (p. 1055). Question: When Dawkins introduced this example, was he arguing pro-Darwinism? Yes he was. In critiquing his example and arguing that information is not created by unguided evolutionary processes, we are indeed making an argument that supports ID.''

But does he even use the algorithm which Richard Dawkins had proposed? This is very doubtful...

Description of the algorithms


Here is the lengthy description which Dawkins gave in his book ''The Blind Watchmaker'':

So much for single-step selection of random variation. What about cumulative selection; how much more effective should this be? Very very much more effective, perhaps more so than we at first realize, although it is almost obvious when we reflect further. We again use our computer monkey, but with a crucial difference in its program. It again begins by choosing a random sequence of 28 letters, just as before:

WDLTMNLT DTJBKWIRZREZLMQCO P

It now 'breeds from' this random phrase. It duplicates it repeatedly, but with a certain chance of random error - 'mutation' - in the copying. The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL. In this instance the winning phrase of the next 'generation' happened to be:

WDLTMNLT DTJBSWIRZREZLMQCO P

Not an obvious improvement! But the procedure is repeated, again mutant 'progeny' are 'bred from' the phrase, and a new 'winner' is chosen. This goes on, generation after generation. After 10 generations, the phrase chosen for 'breeding' was:

MDLDMNLS ITpSWHRZREZ MECS P

After 20 generations it was:

MELDINLS IT ISWPRKE Z WECSEL

By now, the eye of faith fancies that it can see a resemblance to the target phrase. By 30 generations there can be no doubt:

METHINGS IT ISWLIKE B WECSEL

Generation 40 takes us to within one letter of the target:

METHINKS IT IS LIKE I WEASEL

And the target was finally reached in generation 43. A second run of the computer began with the phrase:

Y YVMQKZPFfXWVHGLAWFVCHQXYOPY,

passed through (again reporting only every tenth generation):

Y YVMQKSPFTXWSHLIKEFV HQYSPY

YETHINKSPITXISHLIKEFA WQYSEY

METHINKS IT ISSLIKE A WEFSEY

METHINKS IT ISBLIKE A WEASES

METHINKS IT ISJLIKE A WEASEO

METHINKS IT IS LIKE A WEASEP

and reached the target phrase in generation 64. In a third run the computer started with:

GEWRGZRPBCTPGQMCKHFDBGW ZCCF

and reached METHINKS IT IS LIKE A WEASEL in 41 generations of selective 'breeding'.

And this is Dembski's weasel-algorithm:

Partitioned search is a "divide and conquer" procedure best introduced by example. Consider the L = 28 character phrase

METHINKS ∗ IT ∗ IS ∗ LIKE ∗ A ∗ WEASEL. (19)

Suppose that the result of our first query of L =28 characters is

SCITAMROFN ∗ IYRANOITULOVE ∗ SAM. (20)

Two of the letters {E, S} are in the correct position. They are shown in a bold font. In partitioned search, our search for these letters is finished. For the incorrect letters, we select 26 new letters and obtain

OOT ∗ DENGISEDESEHT ∗ ERA∗NETSIL. (21)

Five new letters are found, bringing the cumulative tally of discovered characters to {T, S,E, ∗,E, S,L}. All seven characters are ratcheted into place. The 19 new letters are chosen, and the process is repeated until the entire target phrase is found.

Well, at least it is short.

Obvious differences



  1. While Dawkins mentions a couple of strings ("It duplicates it repeatedly"), Dembski only mentions one string.
  2. The mutation rate of Dawkins's example seems to be much lower than Dembski's, at least judging from the first two generations:
    Dawkins:
    1) WDLTMNLT*DTJBKWIRZREZLMQCO*P
    2) WDLTMNLT*DTJBKSIRZREZLMQCO*P

    Dembski:
    1) SCITAMROFN*IYRANOITULOVE*SAM
    2) OOT*DENGISEDESEHT*ERA*NETSIL

    (in Dawkins's example only a single letter changes, while in Dembski's nearly every letter does)
  3. Dawkins gives the number of generations for three experiments: 43, 64, and 41. The expected number of generations G, i.e. E(G), for Dembski's algorithm is 104.55: E(G) = Σ_{k≥0} (1 - (1 - (1 - 1/N)^k)^L), where N is the size of the alphabet and L the number of letters of the phrase, i.e. N = 27, L = 28 (a numerical check is sketched below).
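
The sum converges quickly, so the value is easy to check numerically - a two-liner, just plugging in N = 27 and L = 28:

# N = size of the alphabet, L = number of letters of the phrase
N, L = 27, 28
EG = sum(1 - (1 - (1 - 1/N) ** k) ** L for k in range(2000))   # truncating the sum at k = 2000
print(EG)   # about 104.55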


Claimed differences


Dembski always stressed that in the algorithm a letter once found is ''latched'' or ''ratcheted''. And the strings given as examples in Dawkins's book don't exclude that ''latching'' is happening. But Dawkins has stated repeatedly that in his algorithm latching doesn't take place.

Conclusion


Even this superficial comparison of the two algorithms shows that they are very different: two kinds of weasels, so to speak.