Monday, January 29, 2018

The Search Problem of William Dembski, Winston Ewert, and Robert Marks

Introduction to Evolutionary Informatics, by Robert J. Marks II, the “Charles Darwin of Intelligent Design”; William A. Dembski, the “Isaac Newton of Information Theory”; and Winston Ewert, the “Charles Ingram of Active Information.” World Scientific, 332 pages.
Classification: Engineering mathematics. Engineering analysis. (TA347)
Subjects: Evolutionary computation. Information technology–Mathematics.1
Search is a central term in the work of Dr. Dr. William Dembski jr, Dr. Winston Ewert, and Dr. Robert Marks II (DEM): it appears in the titles of a couple of papers written by at least two of the authors, and it is mentioned hundreds of times in their textbook "Introduction to Evolutionary Informatics". Strangely - and in contrast to the other central term, information - it is not defined in this textbook, and neither is search problem or search algorithm. Luckily, dozens of examples of searches are given. I took a closer look to find out what DEM regard as the search problem in "Introduction to Evolutionary Informatics" and how their model differs from those used by other mathematicians and scientists.

A Smörgåsbord of Search Algorithms

In their chapter 3.8 "A Smörgåsbord of Search Algorithms", DEM present
Table 3.2. A list of some search algorithms.
  • active set method38
  • adaptive coordinate descent39
  • alpha–beta pruning40
  • ant colony optimization41
  • artificial immune system optimization42
  • auction algorithm43
  • Berndt–Hall–Hall–Hausman algorithm44
  • blind search
  • branch and bound45
  • branch and cut46
  • branch and price47
  • Broyden–Fletcher–Goldfarb–Shanno (BFGS) method48
  • Constrained optimization by linear approximation (COBYLA)49
  • conjugate gradient method50
  • CMA-ES (covariance matrix adaptation evolution strategy)51
  • criss-cross algorithm52
  • cross-entropy optimization53
  • cuckoo search54
  • Davidon’s variable metric method55
  • differential evolution56
  • eagle strategy57
  • evolutionary programs58
  • evolutionary strategies
  • exhaustive search
  • Fibonacci search59,60
  • firefly algorithm61
  • Fletcher–Powell method 62
  • genetic algorithms63
  • glowworm swarm optimization64
  • golden section search65,66
  • gradient descent67
  • great deluge algorithm68
  • harmony search69
  • imperialist competitive algorithm70
  • intelligent water drop optimization71
  • Karmarkar’s algorithm72
  • Levenberg–Marquardt algorithm73
  • Linear, Quadratic, Integer and Convex Programming74
  • Nelder–Mead method75
  • Newton–Raphson method76
  • one-at-a-time search77
  • particle swarm optimization78
  • pattern search79
  • POCS (alternating projections onto convex sets) 80
  • razor search81
  • Rosenbrock methods 82
  • sequential unconstrained minimization technique (SUMT)83
  • shuffled frog-leaping algorithm84
  • simplex methods85
  • simulated annealing86
  • social cognitive optimization87
  • stochastic gradient search88
  • stochastic hill climbing89
  • Tabu search90
  • Tree search91
  • Zionts–Wallenius method92
A smörgåsbord indeed - and it is not at all clear to me how this list was constructed: DEM just write "[a]n incomplete list of search algorithms37 is provided in Table 3.2." and give as a footnote Donald Knuth's third volume of "The Art of Computer Programming: Sorting and Searching". But obviously, this list is not taken from that book, as
  1. Knuth's definition of searching covers only a finite search space:
    "In general, we shall suppose that a set of N records has been stored, and the problem is to locate the appropriate one. As in the case of sorting, we assume that each record includes a special field called its key."
  2. some of the methods were developed after 1973, the year Knuth's book was published according to DEM
I assume it never hurts to mention Donald Knuth. Fortunately, the footnotes in the table (which are listed at the end of the chapter) are a little more to the point. To save jumping back and forth, I added the given source to every item in the list in a second column. I looked some of them up and tried to find out which kind of search problem the authors of each paper have in mind - this I put into a third column2.
method | source | search problem
active set method J. Nocedal and S. Wright, Numerical Optimization (Springer Science & Business Media, 2006). optimization problem: $\min_{x \in \mathbb{R}^n}f(x)$ subject to $\begin{cases}c_i(x) = 0, &i \in \mathcal{E}\\c_i(x) \ge 0,& i \in \mathcal{I} \end{cases}$
adaptive coordinate descent I. Loshchilov, M. Schoenauer, and M. Sebag, “Adaptive Coordinate Descent.” In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation (ACM, 2011), pp. 885–892. (separable) continuous optimization problems
alpha–beta pruning Donald E. Knuth and Ronald W. Moore, “An analysis of alpha-beta pruning.” Artif Intel, 6(4), pp. 293–326 (1976). "searching a large tree of potential continuations" (p. 234)
ant colony optimization M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: optimization by a colony of cooperating agents.” IEEE Transactions on Systems, Man, and Cybernetics — Part B, 26(1), pp. 29–41 (1996). stochastic combinatorial optimization (here)
artificial immune system optimization Leandro N. de Castro and J. Timmis, Artificial Immune Systems: A New Computational Intelligence Approach (Springer, 2002), pp. 57– 58.
auction algorithm Dimitri P. Bertsekas, “A distributed asynchronous relaxation algorithm for the assignment problem.” Proceedings of the IEEE International Conference on Decision and Control, pp. 1703–1704 (1985).
Berndt–Hall–Hall–Hausman algorithm Ernst R. Berndt, Bronwyn H. Hall, Robert E. Hall, and Jerry A. Hausman, “Estimation and inference in nonlinear structural models.” Annals of Economic and Social Measurement, 3(4), pp. 653–665 (1974).non-linear least squares problems
blind search
branch and bound Patrenahalli M. Narendra and K. Fukunaga, “A branch and bound algorithm for feature subset selection.” IEEE Transactions on Computers, 100(9), pp. 917–922 (1977).
branch and cut M. Padberg and G. Rinaldi, “A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems.” SIAM Rev, 33(1), pp. 60–100 (1991).
branch and price Cynthia Barnhart, Ellis L. Johnson, George L. Nemhauser, Martin W.P. Savelsbergh, and Pamela H. Vance, “Branch-and-price: Column generation for solving huge integer programs.” Operations Research, 46(3), pp. 316–329 (1998).
Broyden–Fletcher–Goldfarb–Shanno (BFGS) method J. Nocedal and Stephen J. Wright, Numerical Optimization, 2nd edition (Springer-Verlag, Berlin, New York, 2006).
Constrained optimization by linear approximation (COBYLA) Thomas A. Feo and Mauricio G.C. Resende, “A probabilistic heuristic for a computationally difficult set covering problem.” Op Res Lett, 8(2), pp. 67–71 (1989).
conjugate gradient method A.V. Knyazev and I. Lashuk, “Steepest descent and conjugate gradient methods with variable preconditioning.” SIAM J Matrix Anal Appl, 29(4), pp. 1267–1280 (2007).linear system with a real symmetric positive definite matrix of coefficients A
CMA-ES (covariance matrix adaptation evolution strategy) Y. Akimoto, Y. Nagata, I. Ono, and S. Kobayashi. “Bidirectional relation between CMA evolution strategies and natural evolution strategies.” Parallel Problem Solving from Nature, PPSN XI, pp. 154–163 (Springer, Berlin Heidelberg, 2010).
criss-cross algorithm Dick den Hertog, C. Roos, and T. Terlaky, “The linear complementarity problem, sufficient matrices, and the criss-cross method.” Linear Algebra Appl, 187, pp. 1–14 (1993).
cross-entropy optimization R.Y. Rubinstein, “Optimization of computer simulation models with rare events.” Eur J Ops Res, 99, pp. 89–112 (1997).

R.Y. Rubinstein and D.P. Kroese, The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning (Springer-Verlag, New York, 2004).

given a set $\mathcal{X}$ and an $\mathbb{R}$-valued function $S$ on $\mathcal{X}$, determine $\max_{\textbf{x} \in \mathcal{X}} S(\textbf{x})$ (here p. 4)
cuckoo search X.S. Yang and S. Deb, “Cuckoo search via Lévy flights.” World Congress on Nature & Biologically Inspired Computing (NaBIC 2009). IEEE Publications, pp. 210–214. arXiv:1003.1594v1.
Davidon’s variable metric method W. C. Davidon, “Variable metric method for minimization.” AEC Research Development Rept. ANL-5990 (Rev.) (1959).
differential evolution P. Rocca, G. Oliveri, and A. Massa, “Differential evolution as applied to electromagnetics.” Antennas and Propagation Magazine, IEEE, 53(1), pp. 38–49 (2011).
eagle strategy Xin-She Yang and Suash Deb, “Eagle strategy using Lévy walk and firefly algorithms for stochastic optimization.” Nature Inspired Cooperative Strategies for Optimization (NICSO 2010) (Springer Berlin Heidelberg, 2010), pp. 101–111.
evolutionary programs Jacek M. Zurada, R.J. Marks II and C.J. Robinson; Editors, Computational Intelligence: Imitating Life (IEEE Press, 1994).
M. Palaniswami, Y. Attikiouzel, Robert J. Marks II, D. Fogel, and T. Fukuda; Editors, Computational Intelligence: A Dynamic System Perspective (IEEE Press, 1995).
evolutionary strategies
exhaustive search
Fibonacci search David E. Ferguson, “Fibonaccian searching.” Communications of the ACM, 3(12), p. 648 (1960).
J. Kiefer, “Sequential minimax search for a maximum.” Proceedings of the American Mathematical Society, 4(3), pp. 502–506 (1953).
firefly algorithm Xin-She Yang, “Firefly algorithms for multimodal optimization.” In Stochastic Algorithms: Foundations and Applications (Springer Berlin Heidelberg, 2009), pp. 169–178.
Fletcher–Powell method R. Fletcher and M.J.D. Powell, “A rapidly convergent descent method for minimization.” Computer J. (6), pp. 163–168 (1963).
genetic algorithms David E. Goldberg, Genetic Algorithms in Search Optimization and Machine Learning (Addison Wesley, 1989).
R. Reed and R.J. Marks II, “Genetic Algorithms and Neural Networks: An Introduction.” Northcon/92 Conference Record (Western Periodicals Co., Ventura, CA, Seattle WA, October 19–21, 1992), pp. 293–301.
glowworm swarm optimization K.N. Krishnanand and D. Ghose. “Detection of multiple source locations using a glowworm metaphor with applications to collective robotics.” Proceedings of the 2005 IEEE Swarm Intelligence Symposium (SIS 2005), pp. 84–91 (2005).
golden section search A. Mordecai and Douglass J. Wilde. “Optimality proof for the symmetric Fibonacci search technique.” Fibonacci Quarterly, 4, pp. 265–269 (1966).
gradient descent Jan A. Snyman, Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms (Springer Publishing, 2005). constrained optimization problem: $\min_{\mathbf{x}} f(\mathbf{x})$, $\mathbf{x} = [x_1, x_2,\ldots, x_n]^T \in \mathbb{R}^n$ subject to $\begin{cases}g_j(\mathbf{x}) \le 0, & j=1,2, \ldots, m \\ h_j(\mathbf{x})=0,& j=1,2,\ldots,r\end{cases}$
great deluge algorithm Gunter Dueck, “New optimization heuristics: the great deluge algorithm and the record-to-record travel.” J Comp Phys, 104(1), pp. 86–92 (1993).
harmony search Zong Woo Geem, “Novel derivative of harmony search algorithm for discrete design variables.” Applied Mathematics and Computation, 199, (1), pp. 223–230 (2008).
imperialist competitive algorithm Esmaeil Atashpaz-Gargari and Caro Lucas, “Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition.” 2007 IEEE Congress on Evolutionary Computation (CEC 2007), pp. 4661– 4667 (2007).
intelligent water drop optimization Hamed Shah-Hosseini, “The intelligent water drops algorithm: a nature-inspired swarm-based optimization algorithm.” Int J Bio-Inspired Comp, 1(1/2), pp. 71–79 (2009).
Karmarkar’s algorithm Narendra Karmarkar, “A new polynomial-time algorithm for linear programming.” Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, pp. 302–311 (1984).
Levenberg–Marquardt algorithm Kenneth Levenberg, “A Method for the Solution of Certain Non-Linear Problems in Least Squares.” Quart App Math, 2, pp. 164–168 (1944).
Linear, Quadratic, Integer and Convex Programming Alexander Schrijver, Theory of Linear and Integer Programming (John Wiley & Sons, 1998).
Yurii Nesterov, Arkadii Nemirovskii, and Yinyu Ye, “Interior-point polynomial algorithms in convex programming.” Vol. 13. Philadelphia Society for Industrial and Applied Mathematics (1994).
given a subset $\Pi \subset \Sigma^* \times \Sigma^*$, where $\Sigma$ is some alphabet, then the search problem is: given string $z \in \Sigma^*$, find a string $y$ such that $(z,y) \in \Pi$, or decide that no such string $y$ exists.
Nelder–Mead method K.I.M. McKinnon, “Convergence of the Nelder–Mead simplex method to a non-stationary point.” SIAM J Optimization, 9, pp. 148–158 (1999).
Newton–Raphson method E. Süli and D. Mayers, An Introduction to Numerical Analysis (Cambridge University Press, 2003).
one-at-a-time search A.H. Boas, “Modern mathematical tools for optimization,” Chem Engrg (1962).
particle swarm optimization J. Kennedy and R. Eberhart, “Particle Swarm Optimization.” Proceedings of IEEE International Conference on Neural Networks IV, pp. 1942–1948 (1995).
pattern search A. W. Dickinson, “Nonlinear optimization: Some procedures and examples.” Proceedings of the 19th ACM National Conference (ACM, 1964), pp. 51–201.
POCS (alternating projections onto convex sets) Robert J. Marks II, Handbook of Fourier Analysis & its Applications (Oxford University Press, 2009).
razor search J.W. Bandler and P.A. Macdonald, “Optimization of microwave networks by razor search.” IEEE Trans. Microwave Theory Tech., 17(8), pp. 552–562 (1969).
Rosenbrock methods H.H. Rosenbrock, “An automatic method for finding the greatest or least value of a function.” Comp. J., 3, pp. 175–184 (1960).
sequential unconstrained minimization technique (SUMT) John W. Bandler, “Optimization methods for computer-aided design.” IEEE Transactions on Microwave Theory and Techniques, 17(8), pp. 533–552 (1969).
shuffled frog-leaping algorithm Muzaffar Eusuff, Kevin Lansey, and Fayzul Pasha, “Shuffled frog-leaping algorithm: a memetic meta-heuristic for discrete optimization.” Engineering Optimization, 38(2), pp. 129–154 (2006).
simplex methods M.J. Box, “A new method of constrained optimization and a comparison with other methods.” Computer J., (8), pp. 42–52 (1965).
J.A. Nelder and R. Mead, “A simplex method for function minimization.” Computer J., 7, pp. 308–313 (1965).
simulated annealing S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, “Optimization by simulated annealing.” Science, 220(4598), pp. 671–680 (1983).
social cognitive optimization X.-F. Xie, W. Zhang, and Z. Yang, “Social cognitive optimization for nonlinear programming problems.” Proceedings of the First International Conference on Machine Learning and Cybernetics, 2, pp. 779–783 (Beijing, 2002).
stochastic gradient search James C. Spall, Introduction to Stochastic Search and Optimization (2003). Find the value(s) of a vector $\mathbf{\theta} \in \Theta$ that minimize a scalar-valued loss function $L(\mathbf{\theta})$, or: find the value(s) of $\mathbf{\theta} \in \Theta$ that solve the equation $\mathbf{g}(\mathbf{\theta}) = \mathbf{0}$ for some vector-valued function $\mathbf{g}(\mathbf{\theta})$
stochastic hill climbing Brian P. Gerkey, Sebastian Thrun, and Geoff Gordon, “Parallel stochastic hillclimbing with small teams.” Multi-Robot Systems. From Swarms to Intelligent Automata, Volume III, pp. 65–77. (Springer Netherlands, 2005).
Tabu search F. Glover, “Tabu Search — Part I.” ORSA J Comput, 1(3), pp. 190–206 (1989). “Tabu Search — Part II”, ORSA J Comput, 2(1), pp. 4–32 (1990).
Tree search Athanasios K. Sakalidis, “AVL-Trees for Localized Search.” Inform Control, 67, pp. 173–194 (1985).
R. Seidel and C.R. Aragon, “Randomized search trees.” Algorithmica, 16(4–5), pp. 464–497 (1996).
Zionts–Wallenius method S. Zionts and J. Wallenius, “An interactive programming method for solving the multiple criteria problem.” Manage Sci, 22(6), pp. 652–663 (1976).
Often the quoted texts are scientific papers: these expect their readers to be familiar with the general framework of their specific problems and will not define the term "search problem" from scratch. Instead, they just name the specific problem they tackle - like stochastic combinatorial optimization.

But there are some textbooks, too: I tried to look them up and to quote what their authors define as their search problem. Nocedal and Snyman both describe the classical optimization problem: here, the search space is a subset of an $n$-dimensional vector space $V$ over $\mathbb{R}$, where $V$ is cut out by some equality and inequality constraints. Finding the target means minimizing an $\mathbb{R}$-valued function on this set.

On the other hand, Wolpert and Macready - in their classical paper "No Free Lunch Theorems for Optimization"3 - look at a finite set $\mathcal{X}$ and a finite ordered set $\mathcal{Y}$, and wish to minimize a function $f: \mathcal{X} \rightarrow \mathcal{Y}$ by finding an $x \in \mathcal{X}$ such that $f(x)$ is minimal.

What do all these approaches have in common? A set to search and a function to optimize. In most cases, the range of the function is $\mathbb{R}$, $\mathbb{Z}$, or $\mathbb{N}_0$, but for some problems (as mentioned in section 3.7.2 "Pareto optimization and optimal sub-optimality" of Introduction to Evolutionary Informatics), another partially ordered set is used. I will ignore this for the time being and just choose an ordered set for my definition of an optimization problem, which should cover virtually all the cases discussed above:

General Optimization Problem
  • a set $\Omega$
  • an ordered set $\mathcal{Y}$ and
  • a function $f: \Omega \rightarrow \mathcal{Y}$
find $x \in \Omega$ such that $f(x) = \min f$
As said before, optimizing and searching are virtually the same, but to stress the character of a search I introduce a target - something that is mentioned in all of DEM's searches. So, my search problem is:
General Search Problem
  • a set $\Omega$
  • an ordered set $\mathcal{Y}$
  • a target $T \subset \Omega$ and
  • a function $f: \Omega \rightarrow \mathcal{Y}$ such that $T = \{\tau \in \Omega | f(\tau)=\min f \}$
find $x \in T$
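The two definitions above can be sketched in a few lines of Python for a finite $\Omega$ (the names `target_set` and `search` are mine, purely for illustration):

```python
# A minimal sketch (the names are mine, not DEM's) of the general
# optimization/search problem above, for a finite search space Omega.
from typing import Callable, Iterable, TypeVar

X = TypeVar("X")

def target_set(omega: Iterable[X], f: Callable[[X], float]) -> set:
    """T = {tau in Omega | f(tau) = min f}: the target is coupled to f."""
    omega = list(omega)
    best = min(f(x) for x in omega)
    return {x for x in omega if f(x) == best}

def search(omega: Iterable[X], f: Callable[[X], float]) -> X:
    """Any minimizer of f solves the general search problem."""
    return min(omega, key=f)

# Example: Omega = {0, ..., 9}, f(x) = (x - 6)^2; the target is {6}.
assert search(range(10), lambda x: (x - 6) ** 2) in target_set(range(10), lambda x: (x - 6) ** 2)
```

Note that the target is not an extra ingredient here: it is determined by $f$, so hitting it can be verified from the values of $f$ alone.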
Nothing substantial has changed; the definition has just become a little more verbose. I am quite sure that most authors of the papers in the table would accept this as a good attempt at a definition - but is it the search problem which DEM have in mind?

On page 48, they provide an example of a search credited to Walter Bradley:

Kirk is an armadillo foraging for grubs when he is bitten by a spider that makes him blind. Kirk wants to return to his armadillo hole, but is disoriented. He knows, though, that his hole is at the lowest elevation in the immediate area, so he balls up and rolls downhill to his hole. When Kirk does this, he is not explicitly seeking his hole. His surroundings are fortuitously designed to take him there. Kirk’s target is thus implicit in the sense it is not specifically sought, but is a result of the environment’s action on him. He can bounce off of trees and be kicked around by playful kids. And repeated trials of rolling down the hill might take drastically different paths. But ultimately, Kirk will end up in his hole at the bottom of the hill. Kirk reaches his home because of information he acquires from his environment. The environment must be designed correctly for this to happen.
Here, $\Omega$ is Kirk's habitat and $f$ is given by the elevation. What is surprising is that DEM make a distinction between the minimum of the function $f$ and Kirk's intended target $T$, his burrow. Luckily, both coincide - but DEM imply that this is not necessarily the case!

Next, they revisit their "pancake search" example: here, the taste of the pancake is a function that depends smoothly on a variety of factors - amount of ingredients, oven temperature, baking time, etc. - the possible combinations of which make up $\Omega$. On this $\Omega$, a cook looks for the best taste by optimizing the taste function. Now DEM restrict $\Omega$ by additional conditions to $\Omega'$, such that the original optimum of $f$ does not lie in the new, restricted set.

For the definitions of the optimization/search problem above, this poses no problem: there is now the set $\Omega'$ to search, looking for the optimum of $f|_{\Omega'}$. Though the new solution will taste worse than the original one, the new target is the solution of the new, restricted problem.
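A toy version of this restricted pancake problem (the grid, the cap on the oven temperature, and the "distaste" function are all invented for illustration) shows that the restricted problem is just as well-posed as the original one:

```python
# Invented example: Omega is a grid of (oven temperature, baking time) pairs,
# and f is a toy "distaste" function (lower = tastier).
omega = [(temp, mins) for temp in (150, 175, 200, 225, 250)
                      for mins in (5, 10, 15, 20, 25, 30)]
f = lambda s: abs(s[0] - 200) + abs(s[1] - 15)

best_full = min(omega, key=f)                          # optimum on the full Omega
omega_restricted = [s for s in omega if s[0] <= 175]   # oven now capped at 175
best_restricted = min(omega_restricted, key=f)         # worse, but a well-defined target
assert f(best_restricted) > f(best_full)
```

The restricted optimum tastes worse, but it is still the target of a perfectly ordinary search problem on $\Omega'$.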

Not so for DEM: "If, however, the environment is constrained in a negative way, the target may never be found even if it was available prior to the alteration."

That is the great difference between the problems which all other scientists discuss and those of DEM: DEM have decoupled the optimum of the function from the target, arriving at quite another version of a search problem:

DEM's Search Problem
  • a set $\Omega$
  • an ordered set $\mathcal{Y}$
  • a target $T \subset \Omega$ and
  • a function $f: \Omega \rightarrow \mathcal{Y}$
find $x \in T$

The Problems with DEM's Search Problem

First, there is of course the problem of applicability: it is not clear how any of DEM's results is relevant for the problems in the table, as those concern a fundamentally different kind of problem.

Then there is a problem of procedure: in an algorithm for a search or an optimization, generally some information about $\Omega$ is given, and (a finite number of) values of $f$ can be obtained. If $T$ is independent of $f$, how is it ever possible to say that the target was hit? This additional information can only be given ex cathedra afterwards!

Not every search algorithm in the table will always identify the target, but in many cases this is possible - at least theoretically: if it is feasible, an exhaustive search will always give you the target. Not so for DEM: even if you have calculated $f$ for all elements of $\Omega$ and found the optimum, this does not have to be the intended target, which still has to be revealed.
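A minimal illustration of this point, with an invented $\Omega$, $f$, and target:

```python
# Invented example: exhaustive search certifies the optimum of f, but if the
# target T is decoupled from f, no number of f-evaluations certifies a hit.
def exhaustive_optimum(omega, f):
    return min(omega, key=f)

omega = ["a", "b", "c", "d"]
f = {"a": 3, "b": 1, "c": 2, "d": 4}.get
best = exhaustive_optimum(omega, f)   # "b": provably the minimum of f
T = {"d"}                             # a DEM-style target, chosen independently of f
# Nothing in the values of f hints that "b" misses T; only the oracle "x in T" does.
assert best == "b" and best not in T
```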

Why do DEM use their definition?

I would like to answer this question using Dawkins's weasel. Then
  • $\Omega$ is the set of strings consisting of 28 letters chosen from the alphabet ABCDEFGHIJKLMNOPQRSTUVWXYZ plus * as a sign indicating a space
  • $f$ is given by the number of correct letters - a number from 0 to 28.
Imagine someone has programmed an algorithm using $f$ which finds the target in 100% of all runs. The big question: how will it fare for the target string I*REALLY*DO*NOT*LIKE*WEASELS?
  1. My answer would be: fantastic! If I*REALLY*DO*NOT*LIKE*WEASELS is the target, then it is the optimum of $f$, so the $f$ for finding this phrase is the number of letters in common with I*REALLY*DO*NOT*LIKE*WEASELS...
  2. DEM's answer would be: abysmal! Though the target is I*REALLY*DO*NOT*LIKE*WEASELS, $f$ is still defined as the number of letters in common with METHINKS*IT*IS*LIKE*A*WEASEL. The algorithm would result in METHINKS*IT*IS*LIKE*A*WEASEL.
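This can be made concrete with a toy weasel-style hill climber (my own sketch, not DEM's or Dawkins's code; the mutation rate and acceptance rule are arbitrary choices): whatever phrase is used to define $f$ is the phrase the algorithm converges to, so under my definition the search always succeeds, while under DEM's decoupled target it can fail in every run.

```python
# Toy weasel-style hill climber (my own sketch): f counts the positions
# agreeing with the phrase that defines it, so that phrase is what is found.
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ*"

def weasel(target: str, rate: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    f = lambda s: sum(a == b for a, b in zip(s, target))
    current = "".join(rng.choice(ALPHABET) for _ in target)
    while f(current) < len(target):
        # mutate each position independently with probability `rate`
        mutant = "".join(rng.choice(ALPHABET) if rng.random() < rate else c
                         for c in current)
        if f(mutant) >= f(current):   # keep the mutant if it is no worse
            current = mutant
    return current

# The same algorithm finds whichever phrase was used to define f:
assert weasel("METHINKS*IT*IS*LIKE*A*WEASEL") == "METHINKS*IT*IS*LIKE*A*WEASEL"
assert weasel("I*REALLY*DO*NOT*LIKE*WEASELS") == "I*REALLY*DO*NOT*LIKE*WEASELS"
```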
The advantage for DEM is stated on p. 173: "We note, however, the choice of an algorithm along with its parameters and initialization imposes a probability distribution over the search space." Indeed it does in their case - and it would not work with my definition. This probability distribution may appear absolutely counterintuitive to any practitioner of optimization, but it is the basic building block of many of DEM's most important results.

How does DEM's search problem work for evolution?

Some interesting characters make a cameo in DEM's textbook: not only Bob and Monica, the pirates X, Y, and Z, Melody and Maverick, but also God and Abraham. In this spirit I would like to invent a dialogue between God and Darwin's Bulldog:
Bulldog: "The horse is a marvellous creature: fully adapted to its niche - really, survival of the fittest at play."
God: "Oh no, that one is a total disaster - it may function better than any other creature in its environment, but I was aiming for pink unicorns."
In short: I think that DEM's model does not work for the usual optimization and search problems of mathematics. It is even worse as a model applied to the real world.

Perhaps these are all strawmen?

It could be that I have erected an elaborate strawman and that the search problem I attributed to DEM has nothing to do with their ideas. In that case, it should be easy for DEM - or their apologists - to come forward with their definition. Or perhaps - if I am right - they may just wish to explain why their model is not horrible.

1. Still thankful for the nice header, Tom!
2. Obviously, the work is not completed yet. I will look up more in the future - and I will be grateful for any contribution to this project!
3. David H. Wolpert, William G. Macready "No Free Lunch Theorems for Optimization", IEEE Transactions on Evolutionary Computation Vol. 1, No. 1, April 1997, p. 68

1 comment:

  1. Hi DiEb,

    Two areas where target sets are detached from objective function extrema are surrogate functions (i.e., when you want to optimize one function, but it is intractable, so you settle for a surrogate whose extrema are hopefully close to the true target), and unreliable objective functions (such as noisy evaluations, which means you may actually be optimizing towards the wrong target). These are ever present issues in applied data science and I'm sure you can think of other cases without too much effort. So, allowing different possible relationships between target sets and information resources like objective functions makes sense. If nothing else, it is a more flexible model that includes the usual state of affairs where the target set is fixed to be the minimum/maximum of the function as a special case. What matters, in the end, is the degree of dependence between the target set and the objective function, namely, how much information about the target can you gain by exploring the objective function?

    Since DEM have evolutionary concerns in mind, it also makes sense that their targets need to be defined independently from the objective function, or else when we say "Evolution can find functional small targets in large spaces, such as eyes, echolocators, wings, and flagella" we are essentially making a tautological statement, because the targets of evolution are defined simply to be the things it produces. Of course evolution can produce what it produces. However, these targets are identifiable independently of the fitness functions of nature, and there is no a priori guarantee that such fitness functions will have maxima that coincide with these targets (and have smooth monotonically increasing fitness slopes leading up to them). If you want to say the targets are exactly those places where the fitness functions are maximized, there is no theoretical guarantee that the targets will exactly overlap with the set of living things. Unless you can show otherwise. I'm open to being persuaded.