## Sunday, June 23, 2013

### Review of "A General Theory of Information Cost Incurred by Successful Search" - Introduction

(For some background information, go here)

There are two main ways to apply mathematics: the first is to shed light on a subject and look for a deeper understanding, the second just wants to create the impression that something important is happening somehow. After looking into the article "A General Theory of Information Cost Incurred by Successful Search" (free download as pdf) I became convinced that the authors are following the second path.

The abstract states:

This paper provides a general framework for understanding targeted search. It begins by defining the search matrix, which makes explicit the sources of information that can affect search progress. The search matrix enables a search to be represented as a probability measure on the original search space. This representation facilitates tracking the information cost incurred by successful search (success being defined as finding the target). To categorize such costs, various information and efficiency measures are defined, notably, active information. Conservation of information characterizes these costs and is precisely formulated via two theorems, one restricted (proved in previous work of ours), the other general (proved for the first time here). The restricted version assumes a uniform probability search baseline, the general, an arbitrary probability search baseline. When a search with probability q of success displaces a baseline search with probability p of success where q > p, conservation of information states that raising the probability of successful search by a factor of q/p(>1) incurs an information cost of at least log(q/p). Conservation of information shows that information, like money, obeys strict accounting principles.
The general framework is introduced pp 26 — 38. In my next post, I'll try to relate it to the usual definitions, but I fail to see how this new frameworks improves e.g., the ideas of David Wolpert and William G. Macready significantly (NFLT at wikipedia). pp 38 — 45 provide examples, interestingly without applying the new framework to them. Then follow a couple of pages with sound math (pp. 45 — 61), it is just not clear what they have to do with the claims the authors are making. For their mathematics to work, they have to show that searches can be represented as measures. Indeed, the authors write:
"This representation will be essential throughout the sequel. (p. 37)
I will elaborate how I think that the authors failed to do so, and that the "representation" is at least a misnomer... Another point will be the subject of "Information Cost": this term isn't defined in the paper...

1. In the conventional NFL analytic framework, the objective (cost, fitness) function is a component of the problem. Dembski, Ewert, and Marks turn the objective function into an "oracle" that is part of the problem-solver itself. This model is inappropriate to most, if not all, of the evolutionary computations they purport to have analyzed.

Back in the 1990's, Dembski committed himself to the misconception that Richard Dawkins' Weasel program used the fitness function in order to "hit the target." Various people have tried, with no apparent success, to explain to him that one of the offspring in each generation survives because it is the most fit. The so-called target is nothing but the fittest individual.

To put it simply, the fitness function comes first. The "target" is defined in terms of the fitness function. Dembski gets this backwards. He believes that the target comes first, and that the fitness function is defined in terms of the target.

Dembski and Marks carry this to extreme in "Life's Conservation Law." They claim that implicit biological targets exist in nature, and that if Darwinian evolution succeeds in "hitting" them, then fitness functions necessarily guide evolution to the targets. A remarkable aspect of this claim is that they treat the fitness function, which is an abstraction appearing in mathematical models of evolution, as though it really exists.

The "search for a search" is another abstraction that they reify. A probability measure on the sample space is a mathematical abstraction. They merely assert that a search practitioner, in selecting a search, searches the uncountably infinite set of probability measures. To that I say, "Give me a physical description of the process."

1. As usual, you are correct, therefore I want to present a conventional framework a la Macready et. el. at first, and than contrast it with Dembski's & Marks's. There are so many problems with D&M's approach: e.g., the "probability for belonging to the target" - the second row of the "search matrix", which becomes the output of a fitness function later on. Or the third row, which is just superfluous. On the other hand, we have cute little helicopters - that must be counted as a plus, I suppose.

Then there is the problem that a search algorithm in D&M's world will only work on one singular target, one singular objective function, and not on a family of those: if you change the target, the search will be "represented" by another measure!

But the whole charade is just created to claim that searches can be somehow represented by measures - which ist just wrong.

As usual, I try to give a simple example: Look for the element 1 in the set {1,2,3,4,5,6}:

First algorithm: a simple guess at random.

Second algorithm: two guesses, each time guessing 1 with a probability of $1-\sqrt{1/6}$, and any other number with a probability of $\sqrt{1/6}/5$.

The obvious discriminator returns the number 1 if it was identified correctly, another number of your guess (or guesses) randomly, if it wasn't identified.

It is obvious that both strategies yield the same representation. But they are very different, especially for other targets...

2. Ouch, $1-\sqrt{5/6}$ resp. $\sqrt{5/6}/5$, obviously...

2. Hi, DiEb. Just found your blog and from what I've seen so far it's awesome! I'm just wondering, has UD responded to anything you've written? Tried to do a namesearch on you at UD but came out empty.