Monday, July 17, 2017

A letter to Winston Ewert

Winston Ewert, William Dembski, and Robert Marks have written a new book, "Introduction to Evolutionary Informatics". Fair to say, I do not like it very much - so I wrote a letter to Winston Ewert, the most accessible of the "humble authors"...
Dear Winston,
congratulations on publishing your first book! It took me some time to get to read it (though I'm always interested in the output of the Evo Lab). Over the last couple of weeks I've discussed your oeuvre on various blogs. I assume that some of you are aware of the arguments at UncommonDescent and TheSkepticalZone, but as those are not peer-reviewed papers, the debates may have been ignored.

Fair to say, I'm not a great fan of your new book. I'd like to highlight my problems by looking at two paragraphs which irked me during the first reading. In your section about "Loaded Die and Proportional Betting", you write on page 77:
"The performance of proportional betting is akin to that of a search algorithm. For proportional betting, you want to extract the maximum amount of money from the game in a single bet. In search, you wish to extract the maximum amount of information in a single query. The mathematics is identical"
This is at odds with the previous paragraphs: proportional betting doesn't optimize a single bet, but a sequence of bets - as you have clearly stated before. I'm well aware of Cover and Thomas's "Elements of Information Theory", but I fail to see how their chapter on "Gambling and Data Compression" is applicable to your idea of a search. I tried to come up with an example, but if I have to search two equally sized subsets $\Omega_1$ and $\Omega_2$, and the target is more likely to be found in $\Omega_1$ than in $\Omega_2$, proportional betting isn't the optimal way to go! Does proportional betting really extract the maximum of information in a single guess?
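
To put numbers on the two-subset example, here is a minimal sketch under one possible reading of it: a single guess is allowed, the target lies in $\Omega_1$ with probability $p$, and "proportional betting" means guessing $\Omega_1$ with probability $p$ and $\Omega_2$ with probability $1-p$. The function names are made up for this illustration and do not come from the book.

```python
def success_always_likelier(p):
    """Single guess: always pick the subset that is more likely to hold the target."""
    return max(p, 1 - p)


def success_proportional(p):
    """Single guess: pick Omega_1 with probability p and Omega_2 with probability 1 - p.

    The guess is right if it picks Omega_1 and the target is there (p * p),
    or picks Omega_2 and the target is there ((1 - p) * (1 - p)).
    """
    return p * p + (1 - p) * (1 - p)


# Compare both strategies for a few values of p (assumed values, for illustration only).
for p in (0.5, 0.6, 0.7, 0.9):
    print(f"p={p}: always likelier={success_always_likelier(p):.2f}, "
          f"proportional={success_proportional(p):.2f}")
```

Under these assumptions the proportional guess succeeds with probability $p^2 + (1-p)^2$, which never exceeds $\max(p, 1-p)$ and is strictly smaller whenever $p \neq 1/2$.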

Then there is the following paragraph on page 173:

"One’s first inclination is to use an S4S search space populated by different search algorithms such as particle swarm, conjugate gradient descent or Levenberg-Marquardt search. Every search algorithm, in turn, has parameters. Search would not only need to be performed among the algorithms, but within the algorithms over a range of different parameters and initializations. Performing an S4S using this approach looks to be intractable. We note, however, the choice of an algorithm along with its parameters and initialization imposes a probability distribution over the search space. Searching among these probability distributions is tractable and is the model we will use. Our S4S search space is therefore populated by a large number of probability distributions imposed on the search space."
Identifying/representing/translating/imposing a search and a probability distribution is central to your theory. It's quite disappointing that you are glossing over it in your new book! While you generally give quite an extensive bibliography, it is surprising that you do not cite any mechanism which translates the algorithm into a probability distribution.

Therefore I do not know whether you are thinking about the mechanism as described in "Conservation of Information in Search: Measuring the Cost of Success": this one results in every exhaustive search finding its target. Or are you talking about the "representation" in "A General Theory of Information Cost Incurred by Successful Search": here, all exhaustive searches will do, on average, at best as well as a single guess (and yes, I think that this is counter-intuitive). As you are talking about $\Omega$ and not any augmented space, I suppose you have the latter in mind...
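
The gap between those two outcomes can be illustrated with a toy model. This is only a sketch under my own simplifying assumptions, not the construction from either paper: the space has four elements, an exhaustive search is just an ordering of them, and the two hypothetical success criteria below stand in for the two readings.

```python
from fractions import Fraction
from itertools import permutations

OMEGA = range(4)  # hypothetical tiny search space, chosen only for illustration


def hits_during_search(order, target):
    """Reading A (assumption): success if any query of the search hits the target."""
    return target in order


def hits_by_final_pick(order, target):
    """Reading B (assumption): the search reports a single element, chosen
    without knowing the target (here simply the last element queried)."""
    return order[-1] == target


def average_success(criterion):
    """Average success over every exhaustive ordering and every possible target."""
    trials = [(order, target)
              for order in permutations(OMEGA)
              for target in OMEGA]
    hits = sum(criterion(order, target) for order, target in trials)
    return Fraction(hits, len(trials))


print(average_success(hits_during_search))
print(average_success(hits_by_final_pick))
```

In this toy setting the first criterion gives an average success of $1$ and the second gives $1/|\Omega|$, even though the underlying searches are identical. The difference comes entirely from how the search is turned into a single outcome.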

But if two of your own "representations" result in such a difference between probabilities ($1$ versus $1/|\Omega|$), how can you be comfortable with making such a wide-reaching claim as "each search algorithm imposes a probability distribution over the search space" without further corroboration? Could you - for example - translate the damping parameters of the Levenberg-Marquardt search into such a probability distribution? I suppose that any attempt to do so would show a fundamental flaw in your model: the separation between the optimum of the function and the target....

I'd appreciate it if you could address my concerns - at UD, TSZ, or my blog.

Thanks,
Yours Di$\dots$ Eb$\dots$

P.S.: I have to add that I find the bibliographies quite annoying: why can't you add the page number when you are citing a book? Sometimes the terms which are accompanied by a footnote cannot be found at all in the given source! It is hard to imagine what the "humble authors" were thinking when they sent their interested readers on such a futile search!