Weasels!

Richard Dawkins proposed a simple computer program to simulate evolution, and to have it generate the phrase "Me thinks it is like a weasel" ("The Blind Watch Maker"). This simple variant produced the string in about 50-60 generations on average.

Having a severe fascination for Genetic Algorithms, and Genetic Programming, I thought I'd try the weasel-program, but using a more "standard GA recepie (see for instance: Machine Learning by Tom M. Mitchell)".

The resulting prgroam uses proabilistic selection of individuals in the population for survival and procreation into the next generation. The probability that a given hypothesis Hi is defined as follows:

P(Hi) = Fitness(Hi) / SFit

where: Fitness(Hi) is the fitness value for a given hypothesis, and

SFit is the sum of fitnesses for all individuals in a generation.

The program also supports elitism in selection, that is, for any generation a given number of the best hypothesis are automatically copied into the next generation, thus the risk of losing good solutions is lessened. Setting the "elitist#" to 0 disables elitism of course. The fitness function is simply the inverse of the sum of the square of the character code distances. In other words, for each position i in the target string and the Hypothesis, take the difference in ASCII values (I.e A-B = 65 - 66 = -1), square it (Ex: (A-B)^2 = (65 - 66)^2 = (-1)^2 = 1), and sum them and take the inverse. If the individual has a 100% match with the target string the sum would be 0, and the inverse would not be defined. So for this special case the fitness is set to 2. This means that fitness would be 2 (total fit) or Fit(Hi) would be in the interval [1.0 ; 0[. A drawback of this fitness fucntion is that it is biased to give good results for strings where the space is in the right position, and to give very low results when a space is in the wrong position. This is obivous from the ASCII values for space (32) and for the first letter, A (65). The minimum distance between a letter and space is 33, while the maximum distance between two letters is 26. This could easily be redone, but I see no need in such a simple program as this.

Sexual recombination (crossover in GA nomenclature) is done as follows. Different ratios of the sexually recombining individuals will undergo different crossover-operators (Once again, see "Machine Learning" by Mitchell):

RatioOperation
0.4One point crossover
0.4Two point crossover
0.2Uniform crossover

Mutation is a simple point-mutation operator. Select (randomly) the position to mutate, and select (randomly) the new character to replace the old one.

One could imagine assing more mutation operators, like inversion, fram shifts and similar, but for now only a simple point mutator is effect.

Using the "original" starting values of the program, will rteach the traget string in about 2000-2500 generations. Using the following values will reach it in 250-300 generations. Further tweaking of the parameters can give even quicker convergance towards the target string.

VariableValue
Population size500
Max generations1000
# of "elites"20
Ratio of pop. sexually reproducing0.65
Mutation rate0.025

Clicking this will allow download of the jar-file ciontaining both class files and source code (warning, the source is not very clear or well engineered).

Clicking this link will run the Weasel-applet.

Tha applet uses Swing so a recent JDK mis a godo idea to have. The program can be run directly from the jar, like this:

java -cp ./weasel.jar WeaselFrame