Sunday, September 2, 2012
Some Annotations to the Previous Post
Friday, August 31, 2012
Could you please correct your miscalculation, Dr. Dembski?
To see how this works, let's consider a toy problem. Imagine that your search space consists of only six items, labeled 1 through 6. Let's say your target is item 6 and that you're going to search this space by rolling a fair die once. If it lands on 6, your search is successful; otherwise, it's unsuccessful. So your probability of success is 1/6. Now let's say you want to increase the probability of success to 1/2. You therefore find a machine that flips a fair coin and delivers item 6 to you if it lands heads and delivers some other item in the search space if it lands tails. What a great machine, you think. It significantly boosts the probability of obtaining item 6 (from 1/6 to 1/2).
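The two probabilities can be checked with a couple of lines of Python (a minimal sketch of the toy problem above; the variable names are mine):

```python
from fractions import Fraction

# Toy search space: items 1..6, target is item 6.
space = range(1, 7)

# Blind search: roll a fair die once and hope for a 6.
p_blind = Fraction(sum(1 for item in space if item == 6), len(space))

# "Assisted" search: flip a fair coin; heads delivers item 6,
# tails delivers some other item of the space.
p_assisted = Fraction(1, 2)

assert p_blind == Fraction(1, 6)
assert p_assisted == Fraction(1, 2)
print(p_assisted / p_blind)  # → 3 : the machine triples the odds
```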
Friday, May 25, 2012
Is the average active information a suitable measure of search performance?
One of my problems with the modeling of searches by William Dembski and Robert Marks is that I don't see how every assisted search can be described as a probability measure on the space of the feasible searches. But nevertheless, Winston Ewert insisted that
"All assisted search, irrespective of the manner in which they are assisted, can be modeled as a probability distribution biased towards selecting elements in the target."
Marks and Dembski claim that the average active information is a measure of search performance - at least they write in their remark:
"If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search."
Their erratum seems indeed to prove the remark in a slightly modified way:
"Given a uniform distribution over targets of cardinality k, and baseline uniform distribution, the average active information will be non-positive."
(The proof of this statement in the erratum is correct - at least as far as I can see...)
So, let's play a game: from a deck of cards one is chosen at random. If you want to play, you have to pay $1, and you get $10 if you guess the card correctly. But you are not alone: three other people A, B and (surprisingly) C will announce their guesses first. They use the following search strategies:
- A: he will announce a card according to the uniform distribution
- B: he will always announce ♦2
- C: he has access to a very powerful oracle which gives him the right card. Unfortunately - due to an old superstition - he is unable to say ♦2, so every time this card appears he will announce another one at random
But what would Marks, Dembski or Ewert do? They would calculate the average active information according to the formula given in the erratum: this E[I+] is 0 for player A, but −∞ for B and C. As negative average active information renders a search's performance worse than random search, they would have to stick with player A.
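The paradox can be made concrete in a few lines of Python. A sketch, following the erratum's definition of E[I+] as the average of log2(q/p0) over uniformly distributed targets; the deck is indexed 0..51 with ♦2 at index 0, and all names are mine:

```python
import math

N = 52                    # cards in the deck
p0 = 1.0 / N              # baseline: probability of a blind guess hitting the target

# q[t]: probability that a player names card t when t is the drawn card
# (index 0 stands for the card ♦2).
q_A = [p0] * N                    # A guesses uniformly at random
q_B = [1.0] + [0.0] * (N - 1)     # B always announces ♦2
q_C = [0.0] + [1.0] * (N - 1)     # C's oracle works except when the card is ♦2

def avg_active_info(q):
    """E[I+] = (1/N) * sum_t log2(q_t / p0), with log2(0) = -infinity."""
    return sum(math.log2(qt / p0) if qt > 0 else -math.inf for qt in q) / N

def success_prob(q):
    """Overall chance of winning when the target card is uniformly distributed."""
    return sum(q) / N

for name, q in [("A", q_A), ("B", q_B), ("C", q_C)]:
    print(name, avg_active_info(q), success_prob(q))
```

Player C wins 51 out of 52 rounds on average, yet his E[I+] is −∞, exactly as for the hapless player B.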
So, either the average active information is not a good measure of search performance, or not every assisted search - irrespective of the manner in which it is assisted - can be modeled as a probability distribution. Or it is a bit of both...
Thursday, May 24, 2012
A new erratum for Dembski's and Marks's The Search for a Search
"You also make brief mention that the HNFLT proof assumes a partition of the search space and thus cannot handle the overlapping targets. This is problematic because any target on the original search space will become an overlapping target on the multi-query search space. I think you are correct on this point. Thanks for bringing it to our attention. We've added a description of the problem as well as an alternate proof for the uniform case on overlapping targets to the pdf: http://evoinfo.org/papers/2010_TheSearchForASearch.pdf."
Here are some thoughts:
Tuesday, May 15, 2012
9,000!
"The post before this one was UD's 9,000th. Thank you to all of our readers for your support as we celebrate this milestone."
So congratulations! But a comment by SCheesman pours a little water into the celebratory wine:
I wish I could celebrate, but I fear 9000 is a reflection of a vast inflation in the number rate of postings in the last year or two, with a corresponding decline in comments.
I owe a good deal of what I know today about ID from UD, both from a scientific and theological perspective, and used to enjoy the long threads and back-and-forth between proponents and opponents.
But now, many, if not most posts get nary a comment, and the ones engendering some debate often are lost in the crowd. Since the recent purge of participants who failed to pass what amounted to a purity test, it's been pretty quiet here. The most lively recent discussion featured a debate between OEC's and YEC's. Now I enjoy that sort of thing (like on Sal Cordova's old "Young Cosmos" blog), but it's hardly what UD used to be known for.
Maybe the new format gets more visitors than it used to, but I'd be interested in seeing the stats, including comments per post, posts per month, unique visitors etc. over the last few years.
I miss the old days. I expect a lot of us do.
I'll try to satisfy that curiosity as well as I can.
Posts per month
Monday, April 23, 2012
Uncommon Descent in numbers

That's not shabby: I don't think that The Panda's Thumb (including AtBC) or the discussion board at www.RichardDawkins.net are busier, though PZ Myers would probably be disappointed by such a low turnout.
Saturday, April 21, 2012
On a Remark by Robert J. Marks and William A. Dembski
Abstract
In their 2010 paper The Search for a Search: Measuring the Information Cost of Higher Level Search, the authors William A. Dembski and Robert J. Marks II present as one of two results their so-called Horizontal No Free Lunch Theorem. One of the consequences of this theorem is their remark: "If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search." This is quite surprising, as one would expect, in the tradition of the No Free Lunch theorems, that the performances are equally good (or bad). Using only very basic elements of probability theory, this essay shows that their remark is wrong - as is their theorem.
Definitions
The situation is quite simple: there is a set $\Omega_1$ - the search space - with $N$ elements, so w.l.o.g. $\Omega_1 = \{1, \ldots, N\}$. Let $T_1 \subset \Omega_1$ denote the target; at first assume that $T_1$ consists of a single element. To find this element you have $K$ guesses, and you aren't allowed to guess the same element twice. Such a sequence of $K$ different guesses can be described as an ordered $K$-tuple $(n_1, n_2, \ldots, n_K)$ of $K$ pairwise different numbers and is called a search. The $\frac{N!}{(N-K)!}$ different searches build the space $\Omega_K$. A search strategy is a probability measure $\mathbf{Q}$ on $\Omega_K$. Such a measure can be given by assigning a probability $\mathbf{Q}((n_1, n_2, \ldots, n_K)) = q_{(n_1, n_2, \ldots, n_K)}$ to each of the (pure) strategies $(n_1, n_2, \ldots, n_K)$ in $\Omega_K$: $q_{(n_1, n_2, \ldots, n_K)}$ is the probability to choose the strategy $(n_1, n_2, \ldots, n_K)$ from $\Omega_K$. So obviously, $q_{(n_1, n_2, \ldots, n_K)} \geq 0$ for all $(n_1, n_2, \ldots, n_K) \in \Omega_K$, and $\sum_{(n_1, n_2, \ldots, n_K) \in \Omega_K} q_{(n_1, n_2, \ldots, n_K)} = 1$. Random search (i.e., the random-search strategy) corresponds to the uniform distribution $\mathbf{U}$ on $\Omega_K$, i.e. $\mathbf{U}((n_1, n_2, \ldots, n_K)) = \frac{(N-K)!}{N!}$. Any other search strategy is called an assisted search. This shows the limited scope of the concept of an assisted search in the papers of Dembski and Marks: it is not possible to encode the influence of an oracle or a fitness function into the measure $\mathbf{Q}$, so most examples presented by the authors (like partitioned search or the Easter egg hunt) are not covered by this model. The performance $\mathcal{P}(\mathbf{Q}, \delta_{T_1})$ of a search strategy for finding an element $T_1 \in \Omega_1$ is defined as the probability to name $T_1$ in one of the $K$ guesses of the search.
Some calculations
Using the characteristic function $$\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1) = \begin{cases} 1 & \text{if } T_1 \in \{n_1,\,n_2,\,\ldots,\,n_K\} \\ 0 & \text{otherwise} \end{cases}$$ this probability can be written as $$\mathcal{P}(\mathbf{Q},\delta_{T_1}) = \sum_{(n_1,\,n_2,\,\ldots,\,n_K) \in \Omega_K} q_{(n_1,\,n_2,\,\ldots,\,n_K)} \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1).$$ The characteristic function is independent of the order of the elements $n_1, n_2, \ldots, n_K$, so the expression can be streamlined by using, instead of the space $\Omega_K$ of ordered $K$-tuples, the space $\mathfrak{P}_K(\Omega_1)$ of subsets $\{n_1,\,n_2,\,\ldots,\,n_K\}$ of $\Omega_1$ with $K$ elements. This space has ${N \choose K}$ elements. $\mathbf{Q}$ on $\Omega_K$ instantly gives rise to a measure on $\mathfrak{P}_K(\Omega_1)$ - again called $\mathbf{Q}$ - by setting $$\mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) = \sum_{\sigma \in S_K} q_{(n_{\sigma(1)},\,n_{\sigma(2)},\,\ldots,\,n_{\sigma(K)})},$$ where $\sigma$ runs through the $K!$ elements of $S_K$, the group of permutations of $K$ elements. Thus, the probability to search successfully for $T_1$ is $$\mathcal{P}(\mathbf{Q},\delta_{T_1}) = \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(T_1).$$ But how does this search perform on average - assuming that the underlying measure is uniform? The underlying measure gives the probability with which a target element is chosen: if there is a uniform underlying measure $\mathbf{U}_{\Omega}$, each element $1, 2, \ldots, N$ is the target with probability $\frac{1}{N}$.
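The passage from ordered tuples to unordered subsets can be checked numerically. A minimal sketch with toy values $N=4$, $K=2$ (names and weights are mine):

```python
import itertools
from fractions import Fraction
from collections import defaultdict

N, K = 4, 2
# All ordered K-tuples of pairwise different elements: the space Omega_K.
tuples = list(itertools.permutations(range(1, N + 1), K))

# An arbitrary strategy q on Omega_K: ad-hoc but normalized weights.
q = {t: Fraction(i + 1) for i, t in enumerate(tuples)}
total = sum(q.values())
q = {t: w / total for t, w in q.items()}

# Induced measure Q on the K-subsets: sum q over all orderings of each subset.
Q = defaultdict(Fraction)
for t, w in q.items():
    Q[frozenset(t)] += w

assert sum(Q.values()) == 1          # Q is again a probability measure

# Success probability for the single-element target T1 = 3:
# sum over all subsets containing 3.
T1 = 3
p_success = sum(w for s, w in Q.items() if T1 in s)
```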
The average performance is given by: $$\mathcal{P}(\mathbf{Q},\mathbf{U}_{\Omega}) = \sum_{l=1}^N \frac{1}{N} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(l)$$ $$= \frac{1}{N} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \left(\sum_{l=1}^N \chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(l)\right)$$ $$= \frac{1}{N} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot K = \frac{K}{N} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) = \frac{K}{N}.$$ This holds for every measure $\mathbf{Q}$ on the space of the searches, especially for the uniform measure, leading to the
Special Remark: If no information about a search for a single element exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
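The Special Remark can be verified by brute force for a toy case: whatever weights are put on the $K$-subsets, the average performance comes out as $K/N$. A sketch with $N=6$, $K=3$ and ad-hoc weights (all names are mine):

```python
import itertools
from fractions import Fraction

N, K = 6, 3
subsets = list(itertools.combinations(range(1, N + 1), K))

# An arbitrary non-uniform strategy Q on the K-subsets (weights chosen ad hoc).
weights = [Fraction(3 * i + 1) for i in range(len(subsets))]
total = sum(weights)
Q = [w / total for w in weights]

# Per-target success probability, then the average over a uniform target.
perf = {t: sum(Q[i] for i, s in enumerate(subsets) if t in s)
        for t in range(1, N + 1)}
avg = sum(perf.values()) / N

assert avg == Fraction(K, N)   # K/N for every Q, as the Special Remark states
```

Changing the weights to any other normalized values leaves `avg` at exactly $K/N$.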
But perhaps this result depends on the special condition of an elementary target? What happens for $T_m = \{t_1, t_2, \ldots, t_m\} \in \mathfrak{P}_m(\Omega_1)$? Then a search is seen as a success if it identifies at least one element of the target $T_m$, thus $$\mathcal{P}(\mathbf{Q},\delta_{T_m}) = \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \max_{t \in T_m}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\}.$$ Calculating the average over all elements of $\mathfrak{P}_m(\Omega_1)$ results in: $$\mathcal{P}(\mathbf{Q},\mathbf{U}_{\mathfrak{P}_m(\Omega_1)}) = \frac{1}{{N \choose m}} \sum_{T \in \mathfrak{P}_m(\Omega_1)} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \max_{t \in T}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\}$$ $$= \frac{1}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \sum_{T \in \mathfrak{P}_m(\Omega_1)} \max_{t \in T}\{\chi_{\{n_1,\,n_2,\,\ldots,\,n_K\}}(t)\}$$ $$= \frac{1}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) \cdot \left({N \choose m} - {N-K \choose m}\right)$$ (a $K$-subset misses an $m$-element target exactly when the target lies in the remaining $N-K$ elements, which happens for ${N-K \choose m}$ of the ${N \choose m}$ targets) $$= \frac{{N \choose m} - {N-K \choose m}}{{N \choose m}} \sum_{\{n_1,\,n_2,\,\ldots,\,n_K\} \in \mathfrak{P}_K(\Omega_1)} \mathbf{Q}(\{n_1,\,n_2,\,\ldots,\,n_K\}) = \frac{{N \choose m} - {N-K \choose m}}{{N \choose m}} = 1 - \frac{(N-K)(N-K-1)\cdots(N-K-m+1)}{N(N-1)\cdots(N-m+1)}.$$ Again, the probability of a successful search - and by definition the performance - is on average independent of the measure used on the space of the searches! (Note that the expression becomes $\frac{K}{N}$ for $m=1$ and $1$ for $N-K < m$.) The result of this calculation is the
Less Special Remark: If no information about a search for a target exists other than the number of elements the target consists of, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
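The closed form behind the Less Special Remark can likewise be checked by brute force. A sketch with $N=7$, $K=3$, $m=2$ and arbitrary weights (`math.comb` requires Python 3.8+; all names are mine):

```python
import itertools
from fractions import Fraction
from math import comb

N, K, m = 7, 3, 2
searches = list(itertools.combinations(range(N), K))
targets = list(itertools.combinations(range(N), m))

# An arbitrary non-uniform strategy on the K-subsets.
weights = [Fraction(2 * i + 1) for i in range(len(searches))]
total = sum(weights)
Q = [w / total for w in weights]

# Average success probability over uniformly chosen m-element targets:
# a search succeeds if it hits at least one target element.
avg = sum(q for q, s in zip(Q, searches)
            for T in targets if set(s) & set(T)) / len(targets)

closed_form = 1 - Fraction(comb(N - K, m), comb(N, m))
assert avg == closed_form   # independent of the chosen weights
```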
As the average performance of random search and of any other search doesn't differ on any set of targets of fixed cardinality, the next remark is elementary:
General Remark: If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in the same search performance as random search.
Obviously this General Remark negates the
Remark of Dembski and Marks: If no information about a search exists, so that the underlying measure is uniform, then, on average, any other assumed measure will result in negative active information, thereby rendering the search performance worse than random search.