I thought about different ways to present my ideas on W. Dembski's and R. Marks's article Conservation of Information in Search: Measuring the Cost of Success, and tried a few out in the previous posts. The section I'm interested in is the description of three evolutionary algorithms they give on pp. 1055-1057. Here's another approach for sections III E - G:
Original Text | Annotations
---|---
E. Partitioned Search |
Partitioned search [12] is a “divide and conquer” procedure best introduced by example. | The name partitioned search seems to be an invention of R. Marks and W. Dembski. The reference [12] is to Dawkins's book The Blind Watchmaker, in which the phrase can't be found. (See Tom English's blog.)
Consider the L = 28 character phrase METHINKS∗IT∗IS∗LIKE∗A∗WEASEL. (19) | This is indeed a phrase used in Dawkins's book, in an algorithm with which Dawkins explained the idea of cumulative selection.
Suppose that the result of our first query of L = 28 characters is SCITAMROFN∗IYRANOITULOVE∗SAM. (20) | An example of the egregious humorous skills of Dembski and Marks: backwards, we get mas*evolutionaryi*nformatics. That's no problem, as the first phrase of the algorithm can be chosen arbitrarily.
Two of the letters {E, S} are in the correct position. They are shown in a bold font. In partitioned search, our search for these letters is finished. | At least, that's the way Dembski's and Marks's search works.
For the incorrect letters, we select 26 new letters and obtain OOT∗DENGISEDESEHT∗ERA∗NETSIL. (21) | Backwards: listen*are*thesedesigned*too. Hilarious, and a sign that we don't see the output of an actual program, but something imagined to be a run of their algorithm. BTW, the fitness function would have to encode the positions of the correct letters, and its possible outcomes wouldn't form a totally ordered set, but only a partially ordered one (what's better: METHINKS*IT*ISXXXXXXXXXXXXXX or XXXXXXXXXXXXXX*LIKE*A*WEASEL?). That's at least unusual, and perhaps a reason that no one else uses partitioned search.
Five new letters are found, bringing the cumulative tally of discovered characters to {T, S, E, ∗, E, S, L}. All seven characters are ratcheted into place. The 19 new letters are chosen, and the process is repeated until the entire target phrase is found. | This ratcheting into place is special to the algorithm: the algorithm described in Dawkins's book doesn't show it.
Assuming uniformity, the probability of successfully identifying a specified letter with sample replacement at least once in Q queries is 1 − (1 − 1/N)^Q, and the probability of identifying all L characters in Q queries is q = (1 − (1 − 1/N)^Q)^L. (22) | Yep, that's true. But why not do a little math and give us the expected number of queries? That would be a bit less banal, and it's helpful if you want to compare different algorithms. (See the first sketch below the table.)
For the alternate search using purely random queries of the entire phrase, a sequence of L letters is chosen. The result is either a success and matches the target phrase, or does not. If there is no match, a completely new sequence of letters is chosen. To compare partitioned search to purely random queries, we can rewrite (5) as p = 1 − (1 − (1/N)^L)^Q. (23) | Yup.
For L = 28 and N = 27 and moderate values of Q, we have p ≪ q, corresponding to a large contribution of active information. The active information is due to knowledge of partial solutions of the target phrase. Without this knowledge, the entire phrase is tagged as “wrong” even if it differs from the target by one character. | So, how big is this active information? For p, it was calculated in section III A as I+(p) = log(Q), and using the same approximation, we get I+(q) = L log(Q) (that's only true-ish for small values of Q and large alphabets...).
The enormous amount of active information provided by partitioned search is transparently evident when the alphabet is binary. Then, independent of L, convergence can always be performed in two steps. From the first query, the correct and incorrect bits are identified. The incorrect bits are then complemented to arrive at the correct solution. Generalizing to an alphabet of N characters, a phrase of arbitrary length L can always be identified in, at most, N − 1 queries. The first character is offered, and the matching characters in the phrase are identified and frozen in place. The second character is offered, and the process is repeated. After repeating the procedure N − 1 times, any phrase characters not yet identified must be the last untested element in the alphabet. | Wow, the hangman's game. In detail. (See the second sketch below the table.)
Partitioned search can be applied at different granularities. We can, for example, apply partitioned search to entire words rather than individual letters. Let there be W words with L/W characters each. Then, the partitioned search probability of success after Q queries is p_W = (1 − (1 − N^(−L/W))^Q)^W. (24) | What's that all about? Imagine an alphabet of 32 letters, including {A,B,...,Z,*}, and our weasel phrase. Then the phrase could also be encoded by 28 5-bit words. A 5-bit word is only correct if all 5 bits are correct. Therefore, we get the same expression for N = 32, L = 28, W = 28 as for N = 2, L = 140, W = 28. (See the third sketch below the table.)
Equations (22) and (23) are special cases for W = L and W = 1. If N^(−L/W) ≪ 1, we can make the approximation p_W ≈ Q^W N^(−L), from which it follows that the active information is I+ ≈ W log2 Q. (25) With reference to (6), the active information is that of W individual searches: one for each word. | So, for W = L, we get I+(q) = L log(Q). My interpretation: in effect, this algorithm provides you with L fitness functions, one for each letter, each indicating whether that letter is correct. BTW: I think it is surprising how the fitness functions get ignored throughout the whole article...
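Since the blog format isn't math-friendly, here is the first sketch: a quick numerical check of (22) and (23), including the expected number of queries I asked for above. This is my own back-of-the-envelope Python, not code from the article; N = 27 and L = 28 are the values from the weasel example.

```python
from math import log2

N, L = 27, 28  # alphabet size and phrase length from the weasel example

def q_partitioned(Q):
    """Eq. (22): probability that all L letters are found within Q queries."""
    return (1 - (1 - 1 / N) ** Q) ** L

def p_random(Q):
    """Eq. (23): probability of success for Q purely random whole-phrase queries."""
    return 1 - (1 - (1 / N) ** L) ** Q

def expected_queries():
    """E[Q] = sum over Q >= 0 of P(partitioned search not finished after Q queries)."""
    total, Q = 0.0, 0
    while 1 - q_partitioned(Q) > 1e-12:
        total += 1 - q_partitioned(Q)
        Q += 1
    return total

print(expected_queries())                        # roughly 104 queries on average
print(log2(q_partitioned(100) / p_random(100)))  # log2(q/p): the gap at Q = 100
```

The expected-value loop just sums the tail probabilities; for N = 27 and L = 28 it lands near 104 queries, which is the kind of number that would have made the comparison between algorithms much more tangible.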
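The second sketch: the hangman procedure itself is easy to write down. This is a minimal version of the N − 1 query argument, assuming, as in their description, an oracle that reports which positions of a query match the target:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ*"  # N = 27 characters, '*' saved for last

def hangman_search(target):
    """Find the target in at most N - 1 queries: offer one character at a time
    and freeze every position where the oracle reports a match."""
    solution = [None] * len(target)
    for ch in ALPHABET[:-1]:                      # at most N - 1 queries
        for i, t in enumerate(target):
            if t == ch:
                solution[i] = ch                  # ratchet matches into place
        if all(s is not None for s in solution):
            break
    # any position still open must hold the one untested character
    return "".join(s if s is not None else ALPHABET[-1] for s in solution)

print(hangman_search("METHINKS*IT*IS*LIKE*A*WEASEL"))
```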
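And the third sketch: the granularity remark in my annotation of (24) can be checked numerically. With a 32-letter alphabet, a "word" of L/W = 1 letters behaves exactly like a word of five bits, so both parameter sets give the same success probability. Again my own check, not the authors' code:

```python
def p_w(Q, N, L, W):
    """Eq. (24): success probability with W words of L/W characters each."""
    return (1 - (1 - N ** (-L / W)) ** Q) ** W

# 32-letter alphabet versus the same phrase encoded as 28 five-bit words
for Q in (10, 100, 1000):
    print(p_w(Q, N=32, L=28, W=28), p_w(Q, N=2, L=140, W=28))
```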
At present, there are 116 hits for partitioned search in Google Scholar. Scanning over what I can see in Scholar, it is evident that usage of the term generally does not match that of Dembski and Marks. I do not have it in me to dig in and see if someone else has used the term as they do.
D&M's use of "partitioned search" does not make sense to me. I can see refinements of a partition of the search space as the search proceeds, with the search restricted to blocks in refined partitions. But the term still seems weird.
The "divide and conquer" makes plenty of sense. If I recall correctly, D&M previously wrote of parallel search. If they wrote out their algorithm explicitly, there would be a "for i in {1, ..., L} do" loop that could be turned into "do parallel" for L non-communicating processes.
The algorithm is precisely equivalent to searching for * in S = {*, A, ..., Z} repeatedly (L times). D&M have rendered the string of characters irrelevant. Their probability of success is the probability that none of L independent searches for * in S fails.
Amusing observation! BTW, I'll work on this article here; the format of the blog isn't math-friendly...