We are all waiting for the ultimate book on Intelligent Design, written by R. Marks and W. Dembski. Instead we get a "textbook", another attempt to explain the concepts to laymen. Not surprisingly, I gave it only two stars.

I got the impression that the authors used this setting to avoid the necessary rigour: they just do not define terms like "search", which they use hundreds of times. This allows for a lot of hand-waving, like the following sentence on p. 174:
"We note, however, the choice of an algorithm along with its parameters and initialization imposes a probability distribution over the search space"
That unsubstantiated claim is essential for their subsequent proofs on "The Search for a Search"!
And then there are details like this one:
p. 130: "For the Cracker Barrel puzzle [we got] an endogenous information of I = 7.15 bits"
p. 138: "We return now to the Cracker Barrel puzzle. We showed that the endogenous information [...] is I = 7.4 bits"
I tried to resolve this conundrum, but came up with yet another value: I = 7.8 bits. I contacted the authors, but got no reply.
Some Details on the Cracker Barrel Puzzle
A more complete quote from p. 130 is:

"For the Cracker Barrel puzzle, all of the 15 holes are filled with pegs and, at random, a single peg is removed. This starts the game. Using random initialization and random moves, simulation of four million games using a computer program resulted in an estimated win probability p = $0.007\,0$ and an endogenous information of $$I_\Omega = -\log_2 p = 7.15\,bits.$$"

So they did not calculate the exact value, they simulated the puzzle 4,000,000 times. A simulation is the easiest approach to program - but how good is it? It should be pretty good: a single simulated game is a Bernoulli trial with probability of success $p_t$, the theoretical probability of winning a single game by chance. Repeating 4,000,000 such trials gives a binomial experiment $B(4,000,000; p_t)$, so the standard deviation of the estimated frequency is $\sigma \approx 0.000\,042$ - that's why stating four digits after the decimal point isn't overconfident: assuming that there is no systematic error, the probability that the actual value $p_t$ lies within $0.007\,00 \pm 0.000\,05$ is $77\%$.
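To make the setup concrete, here is a minimal Python sketch of such a simulation. The board numbering, the move table and the choice of a uniformly random legal jump at each step are my assumptions about the model described in the quote, not the authors' code:

```python
import random

def make_jumps():
    """All legal jumps (source, over, target) on the 15-hole triangular
    board, holes numbered row by row from the top."""
    coords = [(r, c) for r in range(5) for c in range(r + 1)]
    index = {rc: i for i, rc in enumerate(coords)}
    jumps = []
    for (r, c) in coords:
        for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]:
            over, target = (r + dr, c + dc), (r + 2 * dr, c + 2 * dc)
            if over in index and target in index:
                jumps.append((index[(r, c)], index[over], index[target]))
    return jumps

JUMPS = make_jumps()

def play_random_game(rng):
    """One game: remove a random peg, then make uniformly random legal
    jumps until none is left. A win means exactly one peg remains."""
    board = [True] * 15
    board[rng.randrange(15)] = False
    while True:
        legal = [j for j in JUMPS
                 if board[j[0]] and board[j[1]] and not board[j[2]]]
        if not legal:
            return sum(board) == 1
        src, over, dst = rng.choice(legal)
        board[src] = board[over] = False
        board[dst] = True

def estimate(n_games, seed=1):
    rng = random.Random(seed)
    return sum(play_random_game(rng) for _ in range(n_games)) / n_games

if __name__ == "__main__":
    # 4,000,000 games take a while in pure Python; fewer games already
    # give a reasonable estimate of the win probability.
    print(estimate(400_000))
```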
Giving three significant digits for $I_\Omega$ oversells the power of their experiment slightly: it implies that they expect $p_t$ to lie in the interval $[0.007\,017; 0.007\,065]$ with a reasonable probability - but that probability is at best about $44\%$.
Confining themselves to only two significant digits on p. 138 - $I_\Omega = 7.4\;bits$ - yields a much more reliable statement: again, assuming that there is nothing systematically wrong with their calculation, they can say that $p_t$ lies in $[0.005\,72; 0.006\,30]$ with a probability of more than $99.999\,99\%$! Well done...
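For readers who want to check these percentages, here is a small Python sketch of the underlying normal-approximation arithmetic; the function names are mine, and the interval endpoints are simply the ones discussed above:

```python
from math import erf, sqrt

N = 4_000_000

def sigma(p, n=N):
    """Standard deviation of the win frequency estimated from n games."""
    return sqrt(p * (1 - p) / n)

def coverage(p_hat, lo, hi, n=N):
    """Normal-approximation probability that the true win probability
    lies in [lo, hi], given an observed frequency p_hat from n games."""
    s = sigma(p_hat, n)
    z = lambda x: (x - p_hat) / (s * sqrt(2))
    return 0.5 * (erf(z(hi)) - erf(z(lo)))

print(sigma(0.007))                            # ~0.000042
print(coverage(0.0070, 0.00695, 0.00705))      # ~0.77
print(coverage(0.007041, 0.007017, 0.007065))  # ~0.43, the best case (p_hat centred)
print(coverage(0.00592, 0.00572, 0.00630))     # ~0.9999999
```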
Or not: it is very improbable that both values are correct. Very, very, very, very improbable - using the most favourable estimates, the second result should occur with a probability of less than $10^{-98}$ if the first experiment was correctly implemented. It is even worse the other way around: $10^{-112}$.
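To quantify how incompatible the two figures are, a Chernoff (Kullback-Leibler) bound on the binomial tail is enough. The cut-offs below are a simplification of mine - each reported value is taken at face value rather than as a rounding interval - so the exponents do not match the estimates above exactly, but they are astronomically negative either way:

```python
from math import log

def log10_tail_bound(n, p, q):
    """Chernoff (Kullback-Leibler) upper bound on log10 of the probability
    that n Bernoulli(p) trials produce a frequency at least as extreme as q."""
    kl = q * log(q / p) + (1 - q) * log((1 - q) / (1 - p))
    return -n * kl / log(10)

p_130, p_138 = 2 ** -7.15, 2 ** -7.4   # the two published values
print(log10_tail_bound(4_000_000, p_130, p_138))  # hugely negative
print(log10_tail_bound(4_000_000, p_138, p_130))  # even more so
```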
Which value is correct?
The answer is not surprising: both are wrong - the three authors somehow botched the implementation of even the easiest way to approach the question, a simulation. How can I be so cock-sure? I simulated it myself - 4,000,000 times - and got a value of $p = 0.004\,5$. Then I calculated the theoretical value by enumerating all possible games and their respective probabilities: again, $p = 0.004\,5$. Then I published part of my code at The Skeptical Zone, and thankfully, Roy and Corneel also implemented a simulation - which produced compatible results. Lastly, Tom English programmed the problem much more cleverly, getting exactly the same results as I did (I just had to wait much longer for mine...)
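The exact value needs no sampling at all: assume that every legal jump is equally likely at each step, and recurse over the reachable board positions. The sketch below (reusing the move table from the simulation sketch above) is an illustration of this idea, not the code I published:

```python
from functools import lru_cache

# Same move table as in the simulation sketch above.
COORDS = [(r, c) for r in range(5) for c in range(r + 1)]
INDEX = {rc: i for i, rc in enumerate(COORDS)}
JUMPS = [(INDEX[(r, c)], INDEX[(r + dr, c + dc)], INDEX[(r + 2 * dr, c + 2 * dc)])
         for (r, c) in COORDS
         for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]
         if (r + dr, c + dc) in INDEX and (r + 2 * dr, c + 2 * dc) in INDEX]

@lru_cache(maxsize=None)
def win_probability(board):
    """Exact probability of finishing with a single peg from `board`
    (a 15-tuple of booleans), every legal jump being equally likely."""
    legal = [j for j in JUMPS if board[j[0]] and board[j[1]] and not board[j[2]]]
    if not legal:
        return 1.0 if sum(board) == 1 else 0.0
    total = 0.0
    for src, over, dst in legal:
        nxt = list(board)
        nxt[src] = nxt[over] = False
        nxt[dst] = True
        total += win_probability(tuple(nxt))
    return total / len(legal)

# Average over the 15 equally likely choices of the initially empty hole.
p_exact = sum(win_probability(tuple(i != hole for i in range(15)))
              for hole in range(15)) / 15
print(p_exact)   # the exact win probability under this random-play model
```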
Why didn't the authors do the same?