Showing posts with label S4S. Show all posts

Monday, May 11, 2015

Five Years of "The Search for a Search"

The Journal of Advanced Computational Intelligence and Intelligent Informatics published the paper The Search for a Search: Measuring the Information Cost of Higher Level Search by William A. Dembski and Robert J. Marks II in its July 2010 edition. With the fifth anniversary of the publication approaching, it seems appropriate to revisit a pet peeve of mine...

(Shell game performed on Karl-Liebknecht-Straße in Berlin, photograph by E.asphys)

Imagine a shell game. You have observed the con artist for a while, and now you know:

  1. The pea ends up under each of the three shells (left, middle, and right) with the same probability, i.e., $$P(Pea=left)=P(Pea=middle)=P(Pea=right)=1/3$$
  2. If the pea ends up under the left or the middle shell, you are able to track it, so in these cases you will find the pea with probability 1: $$P(Finding\,Pea|Pea=left)=P(Finding\,Pea|Pea=middle)=1$$
  3. However, if the pea ends up under the right shell, then 999 times out of 1000 you make a mistake during your tracking and become convinced that it is under the left or the middle shell - the probability of finding this pea is only 1/1000: $$P(Finding\,Pea|Pea=right)=1/1000$$

You are invited to play the game. Should you use your knowledge (method $M_1$), or should you choose a shell at random (method $M_2$)?
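Before answering, it may help to compute both success probabilities explicitly. A minimal sketch in Python, using exact fractions; the numbers are exactly those stated above:

```python
from fractions import Fraction

# Exact success probabilities for the two methods in the shell game above.
# M1: use the (imperfect) tracking knowledge; M2: pick a shell at random.

p_shell = Fraction(1, 3)  # the pea is equally likely under each shell

# P(finding pea | M1) = sum over shells of P(shell) * P(tracking succeeds | shell)
p_m1 = p_shell * 1 + p_shell * 1 + p_shell * Fraction(1, 1000)

# P(finding pea | M2) = 1/3, regardless of where the pea is
p_m2 = Fraction(1, 3)

print(p_m1, float(p_m1))  # 667/1000, i.e. 0.667
print(p_m2, float(p_m2))  # 1/3, i.e. about 0.333
```

So the flawed knowledge still roughly doubles your chance of finding the pea.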

Sunday, September 28, 2014

Conservation of Information in Evolutionary Search - Talk by William Dembski - part 4

For an introduction to this post, take a look here.

Part 4: 31' 25" - 45' 00"

(I had to pause at 45'; there is such an elementary mistake in Dembski's math, it was just too funny...)

Topics: What is Conservation of Information?

William Dembski: Now let us get to the heart of things: "Conservation of Information". What is that conservation? Let me put it on the next slide.

William Dembski: This is probably the most gem-packed slide in this talk. I want to make a distinction between - what I call - probable and improbable events, and probable and improbable searches. An improbable event is just something that is highly improbable: flip a coin a thousand times, get a thousand heads in a row. Highly improbable. It happens: if you believe in a multiverse, then there is a universe where this is happening, where someone like me is speaking, my doppelgänger flips a coin over the next hour and sees 1000 heads in a row. A probable or improbable search - that is about the probability that a search is successful. It is not so much asking whether it actually succeeds, it is not concerned with the result. It is concerned with the probability distribution associated with the search. This is an important distinction, because so many intelligent design arguments look for a discontinuity in the evolutionary process. We look for highly improbable events. Take the intelligent design people: you get for instance Thomas Nagel's "Mind and Cosmos". He is basically looking at probabilistic miracles. Think how the origin of life undercuts a materialistic understanding of biology. So he is looking into improbable events. That is what we do when we try to find evidence for a discontinuity. What I'm doing in this talk is saying: look, I'm going to give you evolution, give you common ancestry, all of that. That is no problem. What I'm interested in, though, is the probability of success for a search.

member of the audience: What are we searching for?

William Dembski: It is whatever the target happens to be.

Thursday, September 25, 2014

Conservation of Information in Evolutionary Search - Talk by William Dembski - part 1

For an introduction to this post, take a look here.

Part 1: 00' 00" - 09' 40''

Topics: Introduction, What is information?

Leo Kadanoff: [???] He went on to broader interests in subjects including information theory, philosophy and parts of biology. The best write-up I could find about him was the Discovery Institute's write-up on the web: "Mathematician and philosopher William A. Dembski is senior fellow with the Discovery Institute. He has taught at Northwestern University, the University of Notre Dame, and the University of Dallas. He has done postdoctoral work in mathematics at MIT, in physics in Chicago, and in computer science at Princeton. He is a graduate of the University of Illinois, of the University of Chicago, and of Princeton. His fields include mathematics, physics and philosophy, as well as theology." We will probably hear only a fraction of those interests today in his talk about the "Creation of Information in Evolutionary Search".

William Dembski: Okay, well, Leo, it is a pleasure to be back here. Leo was my adviser back in 87/88, along with Patrick Billingsley and [???]. The topic is actually "Conservation of Information in Evolutionary Search". I want to speak about that.

Leo Kadanoff: I said creation! [???]

William Dembski: I'm called a creationist enough, so I make that distinction when I can. What I will describe is the work that I have done with the Evolutionary Informatics Lab - this is their website.

Sunday, July 14, 2013

Dembski's, Ewert's and Marks's Concept of a Search Applied to Exhaustive Searches

At Uncommon Descent, Winston Ewert, co-author of the paper A General Theory of Information Cost Incurred by Successful Search, writes:
"The search is defined to be a six-tuple consisting of the initiator, terminator, inspector, navigator, nominator, and discriminator. The paper studies the question of picking a search at random, and that would imply picking each of the six components at random. We did not consider it necessary to specifically state that each individual component was also selected at random. That would seem to be implied."
So, let $\Omega = \{\omega_1, \omega_2, \dots, \omega_N\}$ be our finite search space with $N$ elements. We are looking for a single element $\omega_k$, so we try to maximize the fitness function $f = \chi_{\omega_k}$. To keep everything finite, we don't allow repetitions, i.e., in our search each place can only be visited once. This is - as Macready and Wolpert observed - always possible by keeping a look-up table and thus doesn't change the set-up. Therefore, our search is completed in at most $N$ steps.
(BTW: The claim that "each of the six components [is picked] at random" seems not to apply to the inspector: this is a fixed function for a search - in our case, the inspector returns the value of the fitness function. Of course, you can say that we pick the inspector at random out of the set of the one possible inspector.)
Let's take a look at all the searches which are ended by their terminator only after the $N$-th step, i.e., the subset of all exhaustive searches. The prize question: What is the probability of finding the target in such an exhaustive search? Until now, everyone looking at such problems would have thought that this probability is one: we certainly visited $\omega_k$ and spotted that the function $f$ takes its maximum there. But in the world of Dembski, Ewert, and Marks it is not, as a random discriminator takes its toll - and discriminators aren't obliged to return the target even if it was found and identified...
Counterintuitive? That is a flattering description: the discriminator's purpose seems to be to turn even a search which is successful by all human standards into a guess to fit the idée fixe that each search can be "represented" by a measure on the search space.
Addendum: We can drop the condition of not having repetitions in our searches and just look at those searches which are terminated only after the whole search space was visited: terminators with this property exist. Such searches may have length $N$, but can be much longer. The result is the same: the probability of finding the target during a complete enumeration of the search space is (much) less than one. I have to ask: What good is a model in which an exhaustive search doesn't fare much better than a single guess?
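To make the oddity concrete, here is a small simulation under one plausible reading of a "random discriminator" (my own reading, not a construction from the paper): after an exhaustive search has visited every element, the discriminator simply returns a uniformly chosen visited element. The function name is mine:

```python
import random

def exhaustive_search_random_discriminator(n, target, rng):
    """Visit every element of {0, ..., n-1}; the inspector certainly sees
    the target, but a random discriminator then returns an arbitrary
    visited element instead of the identified target."""
    visited = list(range(n))
    rng.shuffle(visited)        # exhaustive: every element is queried
    assert target in visited    # the target is certainly inspected ...
    return rng.choice(visited)  # ... but the discriminator ignores that

rng = random.Random(0)
n, target, trials = 10, 3, 100_000
hits = sum(exhaustive_search_random_discriminator(n, target, rng) == target
           for _ in range(trials))
print(hits / trials)  # close to 1/n = 0.1, nowhere near 1
```

Under this reading, a complete enumeration of the search space succeeds no more often than a single blind guess.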

Monday, April 8, 2013

Dembski's and Marks's "The Search for a Search": Trying to work out one of their examples

Three years ago, on April 1st, 2010, W. Dembski's and R. Marks's paper The Search for a Search[1] was accepted for publication. Recently, W. Dembski referred to it as a seminal paper in his article Before They've Even Seen Stephen Meyer's New Book, Darwinists Waste No Time in Criticizing Darwin's Doubt at the Discovery Institute's Evolution News and Views. The first result of their paper is the Horizontal No Free Lunch Theorem, which shows that the average relative performance of [assisted] searches never exceeds that of unassisted or blind searches. Hopefully this paper is covered at the 2013 Summer Seminar on Intelligent Design in the Natural Sciences, July 12-20, 2013, in Seattle, WA - at least in the past, Marks and Dembski have been speakers at this seminar. Wouldn't it be a nice warm-up exercise for all these bright students gathered in Seattle to work out an example of an assisted search and see for themselves how the Horizontal No Free Lunch theorem works? I just ask because I haven't seen any such example calculated by Marks, Dembski or one of their students. They have mentioned a few kinds of assisted searches - like Dawkins's weasel, the Easter egg hunt, etc. - but they have never shown how those fit their model.

So I tried to work out a kind of simplified Easter Egg hunt in the language of Dembski and Marks - and I invite everyone to try to elaborate similar examples. My claim: any search which uses a kind of feedback (an oracle, a fitness function, etc.) cannot be described meaningfully in the terms of Marks and Dembski - which cast a doubt on their far-reaching conclusions about assisted searches in general. But see for yourself:

Sunday, September 2, 2012

Some Annotations to the Previous Post

  1. Joe, at this point I'd advise students to draw a decision tree. Some would draw one with six nodes in the first layer, representing the machines $M_1, M_2, ... , M_6$, and then 36 nodes in the second layer, representing each of the possible outcomes $1,2,...,6$ for each of the machines. At each of the branches, they put the probability of getting from one node to the next, and at the end of the diagram they write down the 36 probabilities for the outcomes, which they get by multiplying the probabilities on the branches leading to the outcome. However, others would opt for a much simpler design, summarizing the machines $M_2, M_3, ..., M_6$ as $\overline{M_1}$ and the non-desirable outcomes $\{1,2, ...,5\}$ as $\overline{6}$, which leads to the following graph:

Friday, August 31, 2012

Could you please correct your miscalculation, Dr. Dembski?

William A. Dembski wrote "a long article […] on conservation of information" at Evolution News and Views (ENV), an outlet of the Discovery Institute. Others have commented on the more sophisticated problems, either at Uncommon Descent or at The Skeptical Zone. Here, I just want to correct some simple math in a toy example used in the article:
To see how this works, let's consider a toy problem. Imagine that your search space consists of only six items, labeled 1 through 6. Let's say your target is item 6 and that you're going to search this space by rolling a fair die once. If it lands on 6, your search is successful; otherwise, it's unsuccessful. So your probability of success is 1/6. Now let's say you want to increase the probability of success to 1/2. You therefore find a machine that flips a fair coin and delivers item 6 to you if it lands heads and delivers some other item in the search space if it land tails. What a great machine, you think. It significantly boosts the probability of obtaining item 6 (from 1/6 to 1/2).
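For reference, here are the numbers of the quoted toy example, together with the active information $I_+ = \log_2(q/p)$ that Dembski and Marks attach to such a boost (a quick sketch; the variable names are mine):

```python
from fractions import Fraction
from math import log2

p = Fraction(1, 6)  # blind search: one roll of a fair die, target is item 6
q = Fraction(1, 2)  # assisted search: the coin-flip machine delivers item 6
                    # on heads, some other item on tails

# Active information as defined by Dembski and Marks: I+ = log2(q/p)
active_info = log2(q / p)
print(active_info)  # log2(3), about 1.585 bits
```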

Friday, May 25, 2012

Is the average active information a suitable measure of search performance?

(Though the following doesn't include any maths, the reader is expected to be familiar with William Dembski's and Robert Marks's paper The Search for a Search and should have glanced at On a Remark by Robert J. Marks and William A. Dembski)

One of my problems with the modeling of searches by William Dembski and Robert Marks is that I don't see how every assisted search can be described as a probability measure on the space of the feasible searches. But nevertheless, Winston Ewert insisted that

All assisted search, irrespective of the manner in which they are assisted, can be modeled as a probably distribution biased towards selecting elements in the target.
Marks and Dembski claim that the average active information is a measure of search performance - at least they write in their remark:
If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search.
Their erratum does indeed seem to prove the remark in a slightly modified way:
Given a uniform distribution over targets of cardinality k, and baseline uniform distribution, the average active information will be non-positive
(The proof of this statement in the erratum is correct - at least as far as I can see...)
So, let's play a game: from a deck of cards, one is chosen at random. If you want to play, you have to pay 1\$, and you get 10\$ if you guess the card correctly. But you are not alone: there are three other people, A, B and (surprisingly) C, who'll announce their guesses first. They use the following search strategies:
  • A: he will announce a card according to the uniform distribution
  • B: he will always announce ♦2
  • C: He has access to a very powerful oracle, which gives him the right card. Unfortunately - due to an old superstition - he is unable to say ♦2, so every time this card appears he will announce another one at random
Which strategy will you follow? Siding with player A or B gives you a chance of 1/52 for a correct guess, so you will lose on average about 81¢ per game. However, if you place your bet with C, you will win about 8.81$ a game in the long run! That's because the probability of a correct guess is 1/52 for both players A and B, while C's chance of success is 51/52.
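The expected winnings can be checked in a few lines of Python, with exact fractions; the stake and payoff are as in the game above:

```python
from fractions import Fraction

STAKE, PAYOFF, DECK = Fraction(1), Fraction(10), 52

# success probabilities of the three players
p_a = Fraction(1, DECK)         # A: uniform guess
p_b = Fraction(1, DECK)         # B: always announces the same card
p_c = Fraction(DECK - 1, DECK)  # C: oracle, but never says the diamond 2

def expected_winnings(p):
    """Expected gain per game when betting alongside a player of success probability p."""
    return PAYOFF * p - STAKE

for name, p in [("A", p_a), ("B", p_b), ("C", p_c)]:
    print(name, float(expected_winnings(p)))
# A and B lose about 0.81 per game; C wins about 8.81 per game
```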

But what would Marks, Dembski or Ewert do? They calculate the average active information according to the formula given in the erratum. This average, E[I+], is 0 for player A, but -∞ for B and C. As negative active information on average renders the search performance worse than random search, they have to stick with player A.

So either the average active information is not a good measure of search performance, or not every assisted search, irrespective of the manner in which it is assisted, can be modeled as a probability distribution. Or it is a bit of both...

Thursday, May 24, 2012

A new erratum for Dembski's and Marks's The Search for a Search

Last month, I made a post On a Remark by Robert J. Marks and William A. Dembski where I addressed errors in a section of Dembski's and Marks's paper The Search for a Search. I exchanged emails about this with Winston Ewert, one of the people at the Evolutionary Informatics Lab (a former or current student of Marks). He informed me:
You also make brief mention that the HNFLT proof assumes a partition of the search space and thus cannot handle the overlapping targets. This is problematic because any target on the original search space will become an overlapping target on the multi-query search space. I think you are correct on this point. Thanks for bringing it to our attention. We've added a description of the problem as well as an alternate proof for the uniform case on overlapping targets to the pdf: http://evoinfo.org/papers/2010_TheSearchForASearch.pdf.
Here are some thoughts:

Saturday, April 21, 2012

On a Remark by Robert J. Marks and William A. Dembski

Abstract

In their 2010 paper The Search for a Search: Measuring the Information Cost of Higher Level Search, the authors William A. Dembski and Robert J. Marks II present as one of two results their so-called Horizontal No Free Lunch Theorem. One of the consequences of this theorem is their remark: If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search. This is quite surprising, as one would expect in the tradition of the No Free Lunch theorem that the performances are equally good (or bad). Using only very basic elements of probability theory, this essay shows that their remark is wrong - as is their theorem.

Definitions

The situation is quite simple: there is a set $\Omega_1$ - the search space - with $N$ elements, so w.l.o.g. $\Omega_1 = \{1,\,\ldots,\,N\}$. Let $T_1 \subset \Omega_1$ denote the target. At first, assume that $T_1$ consists of a single element. To find this element you have $K$ guesses, and you aren't allowed to guess the same element twice. Such a sequence of $K$ different guesses can be described as an ordered $K$-tuple $(n_1,\,n_2,\,\ldots,\,n_K)$ of $K$ pairwise different numbers and is called a search. The $\frac{N!}{(N-K)!}$ different searches build the space $\Omega_K$. A search strategy is a probability measure $\mathbf{Q}$ on $\Omega_K$. Such a measure can be given by assigning a probability $\mathbf{Q}((n_1,\,n_2,\,\ldots,\,n_K)) = q_{(n_1,\,n_2,\,\ldots,\,n_K)}$ to each of the (pure) strategies $(n_1,\,n_2,\,\ldots,\,n_K)$ in $\Omega_K$: $q_{(n_1,\,n_2,\,\ldots,\,n_K)}$ is the probability of choosing the strategy $(n_1,\,n_2,\,\ldots,\,n_K)$ from $\Omega_K$. So obviously, $q_{(n_1,\,n_2,\,\ldots,\,n_K)} \ge 0\; \forall (n_1,\,n_2,\,\ldots,\,n_K)\in\Omega_K$ and $\sum_{(n_1,\,n_2,\,\ldots,\,n_K) \in \Omega_K} q_{(n_1,\,n_2,\,\ldots,\,n_K)} =1 $. Random search (i.e., the random search strategy) corresponds to the uniform distribution $\mathbf{U}$ on $\Omega_K$, i.e., $\mathbf{U}((n_1,\,n_2,\,\ldots,\,n_K)) = \frac{(N-K)!}{N!}$. Any other search strategy is called an assisted search. This shows the small scope of the concept of an assisted search in the papers of Dembski and Marks: it is not possible to encode the influence of an oracle or a fitness function into the measure $\mathbf{Q}$, so most examples presented by the authors (like partitioned search or the Easter egg hunt) are not covered by this model. The performance $\mathcal{P}(\mathbf{Q},\delta_{T_1})$ of a search strategy for finding the target $T_1$ is defined as the probability of naming the element of $T_1$ in one of the $K$ guesses of the search.
Some calculations
Using the characteristic function $\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1)= \left\lbrace \begin{array}{cl} 1 & if\;T_1 \in \{n_1,\,n_2,\,\ldots,\,n_K\} \\ 0 & otherwise \end{array} \right. $ this probability can be written as: $\mathcal{P}{(\mathbf{Q},\delta_{T_1})} = \sum_{(n_1,\,n_2,\,\ldots,\,n_K) \in \Omega_K} q_{(n_1,\,n_2,\,\ldots,\,n_K)} \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1).$ The characteristic function is independent of the order of the elements $n_1, n_2, \ldots n_K$, so the expression can be streamlined by using, instead of the space of ordered $K$-tuples $\Omega_K$, the space $\mathfrak{P}_K(\Omega_1)$ of subsets $\{n_1,\,n_2,\,\ldots,\,n_K\}$ with $K$ elements of $\Omega_1$. This space has ${N \choose K}$ elements. $\mathbf{Q}$ on $\Omega_K$ instantly gives rise to a measure on $\mathfrak{P}_K(\Omega_1)$ - again called $\mathbf{Q}$ - by setting: $ \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) = \sum_{\sigma \in S_K} q_{(\sigma_1(n_1),\,\sigma_2(n_2),\,\ldots,\,\sigma_K(n_K))}.$ Here, $\sigma = (\sigma_1,\sigma_2,\ldots,\sigma_K)$ runs over the $K!$ elements of $S_K$, the group of permutations of $K$ elements. Thus, the probability of searching successfully for $T_1$ is $\mathcal{P}{(\mathbf{Q},\delta_{T_1})} = \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1).$ But how does this search perform on average - assuming that the underlying measure is uniform? The underlying measure gives the probability with which a target element is chosen: if there is a uniform underlying measure $\mathbf{U}_{\Omega}$, each element $1,2,\ldots, N$ is the target with a probability of $\frac{1}{N}$. 
The average performance is given by: $\mathcal{P}{(\mathbf{Q},\mathbf{U}_{\Omega})} =\sum_{l=1}^N \frac{1}{N}\sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(l)$ $= \frac{1}{N}\sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \left(\sum_{l=1}^N \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(l)\right)$ $= \frac{1}{N}\sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot K $ $=\frac{K}{N} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\})$ $=\frac{K}{N}.$ This holds for every measure $\mathbf{Q}$ on the space of the searches, especially for the uniform measure, leading to the
Special Remark: If no information about a search for a single element exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
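This remark is easy to check by brute force: pick an arbitrary measure $\mathbf{Q}$ on the $K$-subsets of a small search space and average the performance over all single-element targets. A sketch, with $N$ and $K$ chosen small enough for full enumeration:

```python
from fractions import Fraction
from itertools import combinations
import random

# Brute-force check: for ANY measure Q on the K-subsets of a search space
# with N elements, the average performance over a uniform single-element
# target is K/N. We try a randomly generated measure Q.
N, K = 6, 3
subsets = list(combinations(range(N), K))

rng = random.Random(1)
weights = [Fraction(rng.randint(1, 100)) for _ in subsets]
total = sum(weights)
Q = [w / total for w in weights]  # an arbitrary probability measure

# average over all targets t of the probability that the search covers t
avg = sum(Fraction(1, N) * sum(q for s, q in zip(subsets, Q) if t in s)
          for t in range(N))
print(avg)  # K/N = 1/2, independent of the chosen Q
```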
But perhaps this result depends on the special condition of an elementary target? What happens for $T_m = \{t_1, t_2, \ldots, t_m\} \in \mathfrak{P}_m(\Omega _1)$? Then a search is seen as a success if it identifies at least one element of the target $T_m$, thus $\mathcal{P}{(\mathbf{Q},\delta_{T_m})} = \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \max_{t \in T_m}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\}.$ Calculating the average over all elements of $\mathfrak{P}_m(\Omega_1)$ results in: $\mathcal{P}{(\mathbf{Q},\mathbf{U}_{\mathfrak{P}_m(\Omega_1)})} = \frac{1}{{N \choose m}} \sum_{T \in \mathfrak{P}_m(\Omega_1)} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \max_{t \in T}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\}$ $=\frac{1}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \sum_{T \in \mathfrak{P}_m(\Omega_1)} \max_{t \in T}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\} $ $=\frac{1}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \left({N \choose m} - {N-K \choose m}\right) $ $= \frac{{N \choose m} - {N-K \choose m}}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) $ $= \frac{{N \choose m} - {N-K \choose m}}{{N \choose m}} = 1-\frac{(N-K)(N-K-1)\cdots (N-K-m+1)}{N(N-1)\cdots (N-m+1)}.$ Again, the probability of a successful search - and by definition the performance - is on average independent of the measure used on the space of the searches! (Note that the expression becomes $\frac{K}{N}$ for $m=1$ and $1$ for $N-K < m$.) The result of this calculation is the
Less Special Remark: If no information about a search for a target exists other than the number of elements the target consists of, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
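The closed formula above can likewise be verified by enumeration for small parameters, again with an arbitrarily chosen measure $\mathbf{Q}$:

```python
from fractions import Fraction
from itertools import combinations
from math import comb
import random

# Numerical check of the Less Special Remark: the average success
# probability over all m-element targets equals 1 - C(N-K,m)/C(N,m)
# for ANY measure Q on the K-subsets of the search space.
N, K, m = 7, 3, 2
searches = list(combinations(range(N), K))
targets = list(combinations(range(N), m))

rng = random.Random(2)
w = [Fraction(rng.randint(1, 50)) for _ in searches]
Q = [x / sum(w) for x in w]  # an arbitrary probability measure

# a search s succeeds for target T iff it hits at least one element of T
avg = sum(Fraction(1, len(targets)) * q
          for s, q in zip(searches, Q)
          for T in targets if any(t in s for t in T))

expected = 1 - Fraction(comb(N - K, m), comb(N, m))
print(avg == expected, float(expected))
```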
As the performance of the random search and any other search doesn't differ on any subset of targets of fixed cardinality, the next remark is elementary:
General Remark: If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
Obviously this General Remark negates the
Remark of Dembski and Marks: If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search.
Consequences for the Horizontal No Free Lunch Theorem
There is much critique of this theorem out there, e.g., "A Note Regarding 'Active Entropy'". The previous paragraphs add to it: Dembski's and Marks's remark is a direct consequence of the theorem, and as the remark is wrong, the theorem can't be right either. How is it possible for two mathematicians to err on such a crucial point? Perhaps they were seduced by the elegance of their argument: the calculations in the sections above are simple and a little bit clumsy. Dembski and Marks smartly augmented the original search space $\Omega_1$ and identified it with the space of pure search strategies $\Omega_K$. For each original target $T_1$ a corresponding target $\tilde{T_1}$ can be found in $\Omega_K$, such that a simple search $(n_1, \ldots, n_K)$ is successful exactly when it is an element of $\tilde{T_1}$. So the problem of searching with $K$ guesses on $\Omega_1$ is reduced to a single guess on $\Omega_K$. The problem which they overlooked: the corresponding targets aren't nice subsets. In particular, if $K>1$, then the corresponding sets for two different targets in the original space always have a non-empty intersection. Nevertheless, Dembski and Marks use an exhaustive partition of the augmented space in their proof of the Horizontal No Free Lunch theorem - but there is no such partition into subsets which makes sense in the framework of their argument.
Conclusion
Dembski and Marks have introduced their concept of Active Information using a great number of examples. Their model of an assisted search works for only one or two of these examples. The theoretical results presented in The Search for a Search seem not to work at all. That leaves Active Information as a new performance measure, but it is hard to see how it improves the understanding of the concept of information in connection with search algorithms.

Sunday, October 10, 2010

Bernoulli vs. Clark Kent

When discussing the consequences of their Horizontal No Free Lunch theorem, the authors W. Dembski and R. Marks state:

Because the active entropy is strictly negative, any uninformed assisted search will on average perform worse than the baseline search.


Here, the baseline search is the blind search. But what is an assisted search? As an example, the authors have introduced the perfect search

an assisted search guaranteed to succeed


Let's look at an example: a shell game - three shells, one pea. The operator is a fair man, who will hide the pea with probability 1/3 under any of the three shells.

The players are Bernoulli and Clark Kent. Knowing the paper of W. Dembski and R. Marks on the HNFLT, Bernoulli makes a guess at random, using a uniform distribution on the shells.

Clark Kent just picks the shell which hides the pea (remember - Clark Kent is Superman...), thereby performing a perfect search.

Who fares better? According to W. Dembski and R. Marks, both do equally well: observing the picking pattern of Clark Kent, they conclude that Clark Kent uses a uniform measure on the search space - as Bernoulli does. Unfortunately, the mathematics of their paper has no way to deal with a strategy which is influenced by the position of the target (as any assisted search would be).

Friday, September 24, 2010

Draft of a letter to JACIII

In their article The Search for a Search1 the authors William Dembski and Robert Marks seem to assume that any search on a space induces a probability measure on the search space, as they write in section 3.1 (Active Information):

"Let U denote a uniform distribution on Ω characteristic of an unassisted search and φ the (nonuniform) measure of Ω for an assisted search. Let U(T) and φ(T) denote the probability over the target set T[sic]Ω."

Such an induced measure gives the probability to find a subset T of Ω when performing the corresponding search. But for the assisted searches mentioned in the text (a "perfect search" - "indicating an assisted search guaranteed to succeed" - and another search "indicating an assisted search guaranteed to fail"), such probability measures obviously do not exist! Thus these cases are not covered by their Theorem 1 (Horizontal No Free Lunch), as the Kullback-Leibler-distance is defined only for probability measures.

But perhaps the authors intended their theorem only to be valid for those searches which induce a measure on the search space - indeed, they use the term uninformed assisted searches when describing their results. Unfortunately, this term is not defined - neither in the current text, nor in their previous publications Conservation of Information in Search - Measuring the Cost of Success2 or Efficient Per Query Information Extraction from a Hamming Oracle3.

Unfortunately, even in this case the theorem does not work: assuming the easiest nontrivial case of guessing an element of a set with two elements (tossing a coin), Ω = {head, tail}, the two strategies of naming an element at random (uniform distribution on Ω) and of sticking with one element (Dirac distribution) give the same success rate for a fair coin. The same result occurs when averaging over the probability of showing head (assuming that this probability is uniformly distributed on [0,1]), contradicting the statement of William Dembski and Robert Marks:

"Because the active entropy is strictly negative, any uninformed assisted search (φ) will on average perform worse than the baseline search". (Highlighting mine)
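The coin example can be checked numerically. Both strategies average a success rate of exactly 1/2 over the uniform prior on p; the midpoint-grid average below happens to be exact for both, since each success probability is linear in p:

```python
from fractions import Fraction

# Two strategies for guessing one toss of a coin with P(head) = p:
#  - uniform: guess head or tail with probability 1/2 each
#  - dirac:   always guess head
def uniform_success(p):
    return Fraction(1, 2) * p + Fraction(1, 2) * (1 - p)  # = 1/2 for any p

def dirac_success(p):
    return p

# Average over p on a midpoint grid approximating the uniform prior on [0,1].
n = 1000
grid = [Fraction(2 * i + 1, 2 * n) for i in range(n)]
avg_uniform = sum(uniform_success(p) for p in grid) / n
avg_dirac = sum(dirac_success(p) for p in grid) / n
print(avg_uniform, avg_dirac)  # both exactly 1/2
```

So the "uninformed assisted" Dirac strategy performs, on average, exactly as well as the baseline - not worse.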

In my opinion, the fundamental problem stems from identifying the search space Ω(n) and the space of (deterministic) searches. This is doubtless elegant, but does not seem to work.

References:


1W. A. Dembski and R. J. Marks II, "The Search for a Search: Measuring the Information Cost of Higher Level Search," Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 14, No. 5, pp. 475-486, September 2010

2W. A. Dembski and R. J. Marks II, "Conservation of Information in Search: Measuring the Cost of Success," IEEE Trans. on Systems, Man and Cybernetics A, Systems and Humans, Vol. 39, No. 5, pp. 1051-1061, September 2009

3W. Ewert, G. Montanez, W. A. Dembski, and R. J. Marks II, "Efficient Per Query Information Extraction from a Hamming Oracle," Proc. of the 42nd Meeting of the Southeastern Symposium on System Theory, IEEE, University of Texas at Tyler, pp. 290-297, March 7-9, 2010

Sunday, August 29, 2010

Search for Search Paper Finally Out

William A. Dembski informs us at Uncommon Descent that the paper A Search for a Search: Measuring the Information Cost of Higher Level Search (a collaboration with Robert Marks II) is finally published. Two things are remarkable about his post:


  • The comments are turned off
  • It's so restrained - wasn't this paper once announced as a nail in the coffin of Darwinism?

One obvious flaw I found in an earlier draft of the paper is addressed: instead of talking about any searches, they are now talking about searches without repetition.


But I still have a(t least one) problem with this paper, which may be resolved by more careful reading over the next few days. As I wrote to Robert Marks:


I was a little bit irritated that the proof of the HNFLT still uses the
Kullback-Leibler distance, as I can't see how a non-trivial search space
(i.e., a search space for a search consisting of at least two queries)
can be exhaustively partitioned in a meaningful way.

Here is an example I used earlier: imagine a shell game with three shells where you are allowed two guesses. To put it more formally:


  • The search space Ω1 is {1,2,3}

In section 2.1 (Blind and Assisted Queries and Searches), Dembski and Marks describe how such searches can be construed as a single query when the search space is appropriately defined. They introduce an augmented search space ΩQ, consisting of the sequences without repetition of length Q, i.e., the number of queries. In our case:


  • The augmented search space Ω2 is {(1,2),(1,3),(2,1),(2,3),(3,1),(3,2)}

Of course, the target has to be changed accordingly to TQ, where TQ ∈ [sic] ΩQ consists of all the elements containing the original target.


So, if in the original game the shell with the pea was No. 1, then in our new space T2 = {(1,2),(1,3),(2,1),(3,1)}. All of this seems sensible.
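The augmented space and the new target can be enumerated in a few lines - a straightforward check of the sets just described:

```python
from itertools import permutations
from fractions import Fraction

# The augmented search space for the three-shell game with two guesses:
# all sequences of two different shells.
omega1 = {1, 2, 3}
omega2 = list(permutations(omega1, 2))  # sequences without repetition
pea = 1                                 # the pea is under shell No. 1
T2 = [s for s in omega2 if pea in s]    # searches that find the pea

print(sorted(omega2))  # [(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)]
print(sorted(T2))      # [(1,2), (1,3), (2,1), (3,1)]
print(Fraction(len(T2), len(omega2)))  # 2/3 under the uniform measure
```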


Let's have a look at a search strategy, for instance:


  • First, look under a random shell
  • Second, look under another random shell

By such a strategy, a probability measure is introduced on Ω2, and the probability of a successful search for T2 can be seen immediately: it's |T2|/|Ω2| = 4/6 = 2/3. No surprise here.


In fact, any search strategy can be seen as a measure on ΩQ - and that's what Dembski and Marks are doing in the following. My problem: not every subset of ΩQ can be seen as a reasonable target for a search. The complement of T2 doesn't represent any target in the original space Ω1! And though this set can be measured by the measure introduced above (2/6), this measure doesn't make sense in the context of a search.


And that's what I don't understand about the Horizontal No Free Lunch Theorem (3.2). The first sentence is:


Let φ and ψ be arbitrary probability measures and let T̃ = {T_i}, i = 1, …, N, be an exhaustive partition of Ω all of whose partition elements have positive probability with respect to ψ

How can Ω2 be partitioned in a sensible way? And why should I try to compare two search strategies on sets which they will never search for?