Code (.sea.hqx)
Genetically Seeking Sparse Rulers
David Fowler
University of Nebraska-Lincoln
- inch ruler with markings at every inch, one can measure any integral number of inches from 1 to n. A little thought shows that not all of those marks are necessary. Rulers with minimal or nearly minimal numbers of markings are called "sparse rulers." This notebook contains the code for finding sparse rulers using a genetic algorithm.
The procedure in a genetic algorithm is to work iteratively toward a solution to a problem. At any given iteration, a family of possible solutions is evaluated for their "fitness," that is, how close they come to satifying the conditions of the problem. The algorithm selects the fittest solutions and then recombines these solutions to produce a new generation of solutions. In this way a solution to the problem "evolves."
References to genetic algorithms [Holland, 1992], genetic programming [Koza, 1992], and further general questions of evolution and order [Kauffman, 1993] can be found at the conclusion of this notebook.
The sparse ruler problem is ideal for illustrating the process of genetic solution. In the following notebook, a ruler for the number p is simply represented as a list beginning with 0 and ending with p. The case for p = 36 is interesting, because there is exactly one minimal sparse ruler for this value.
SeedRandom[1]
As an example, consider the ruler for p = 36 for the following arbitrarily selected values:
p=36;
r = {0, 1, 9, 18, 27, 36};
The fitness of this ruler is measured by comparing the number of marks in the ruler with the number of integers the ruler will measure. The number of marks in the ruler (including 0) is just the Mathematica Length[] function. The set of all numbers that the ruler will measure is found with the defined function span[]. One way to find this set is to map the absolute value function across all differences generated by a given ruler list.
absDiff[{s_,t_}]:=Abs[s-t];
span[s_List]:=Union[Map[absDiff,Flatten[Outer[List,s,s],1]]];
For the example r,
span[r]
{0, 1, 8, 9, 17, 18, 26, 27, 35, 36}
The function fitness[] can now be defined as a function that will increase when either the number of marks in the ruler decreases or the number of integers measured by the ruler increases.
fitness[s_List]:=((p+1)-Length[s])/((p+2)-Length[span[s]]);
For the example given here, the fitness value is a little greater than 1.
fitness[r]
31/28A ruler containing marks at every point from 0 to 36 would have a fitness equal to 0. We know that a unique sparse ruler with ten marks exits for p
= 36. The length of the span of this ruler would be 37, and the resulting fitness value would be 27.
Through a rather loose usage of biological terminology, our rulers can be thought of as "chromosomes," and each number in the ruler set as a "gene." The function donate[] is a "crossing-over" function that pastes together parts of two rulers. This function occasionally leaves out some integer common to both rulers, resulting in a "mutation by omission." The function newGene[] will result in "mutation by insertion" of a new integer. The rate is fixed at a rather high value, 0.7 in this case. The function mate[] combines both types of mutation. This function also guarantees that the values 0 and p will be retained in any new combination. Finally, the offSpring[] function generates a new set of n rulers by "mating" a given pair of parent rulers.
donate[s_List,t_List]:=Block[{ri1=Random[Integer,Min[Length[s],
Length[t]]],
ri2=Min[ri1+
Round[Random[Real,0.7]],
Length[t]]},
Union[Take[s,ri1],
Drop[t,ri2]]];
newGene[]:=Random[Integer,p] * Round[Random[Real,0.7]];
mate[{s_,t_}]:=Union[{newGene[]},{0,p}, donate[s,t]];
offSpring[{s_,t_},n_]:=Table[mate[{s,t}],{n}];
The example below shows a generation of five new rulers resulting from the combination of two parent rulers. The fitness[] function tells us which two to select.
r1 = {0, 1, 2, 15, 29, 36};
r2 = {0, 3, 7, 9, 31, 33, 36};
offSpring[{r1, r2}, 5]
{{0, 1, 2, 15, 29, 36}, {0, 1, 7, 9, 18, 31, 33, 36},
{0, 1, 2, 15, 31, 33, 36}, {0, 1, 2, 4, 15, 29, 36},
{0, 1, 2, 9, 31, 33, 36}}
Map[fitness,%]//N
{1.29167, 2.23077, 1.57895, 1.57895, 1.57895}
The scheme is to begin with two original rulers, labeled betaRuler and alphaRuler, then recursively generate new rulers in population sizes equal to 5. [This value, like the mutation rate of 0.7 above, was arrived at by experimenting. I will return to a discussion of these values later]. The two fittest rulers from each generation are indexed as newBetaRuler[i], and newAlphaRuler[i], where fitness[newBetaRuler[i]] < or = fitness[newAlphaRuler[i]]. Notice the use of delayed assignment ( := ) followed by immediate assignment ( = ) in setting up the recursion.
Since the ruler labeled betaRuler={0,p};
alphaRuler={0,p};
rulerGeneration[0]={betaRuler,alphaRuler};
rulerGeneration[n_]:=rulerGeneration[n]=
offSpring[
Take[Sort[
rulerGeneration[n-1],
fitness[#1]newAlphaRuler[i] will have the best fitness value, a plot of these values will indicate how well the generations of rulers are evolving toward a solution.
lpf15 =ListPlot[Table[fitness[newAlphaRuler[i]],{i,15}],PlotJoined->True];

The graph shows that the rulers are evolving in terms of increasing fitness. The evolution is obviously not monotonic, however. The best ruler in this set of generations is newAlphaRuler[15].
newAlphaRuler[15]
{{0, 10, 19, 24, 28, 29, 35, 36}
The ruler is reasonably short.
Length[%]
8The ruler will measure 23 of the numbers between 0 and 36.
span[newAlphaRuler[15]
{0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 18, 19, 24, 25, 26,
28, 29, 35, 36}
Complement[Table[i, {i, 0, 36}],span[newAlphaRuler[15]]]
{2, 3, 13, 15, 20, 21, 22, 23, 27, 30, 31, 32, 33, 34}
Length[span[newAlphaRuler[15]]]
23We can see from the fitness value that there is still considerable room for improvement. [Recall that we should be able to find fitness values up to 27].
fitness[newAlphaRuler[15]]
29/15This fitness function seems to select quickly the rulers that will measure all or most of the numbers between 0 and a given value p. Here is a typical plot of 100 generations.
ListPlot[Table[fitness[newAlphaRuler[i]],{i,100}],PlotJoined->True];

Here are the last eleven rulers from this set of generations. They all measure the entire set of numbers between 0 and 36. The difference in their fitness values results from the differences in length.
Table[newAlphaRuler[i],{i,85,95}]
{{0, 1, 2, 18, 19, 21, 24, 26, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 26, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 26, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 26, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36},
{0, 1, 2, 18, 19, 21, 24, 27, 30, 31, 33, 36}}
Map[Length,Map[span,%]]
{37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37}
Map[fitness,%%]//N
{24., 24., 24., 24., 25., 25., 25., 25., 25., 25., 25., 25.}
The values selected for population size [5] and mutation rate [0.7] result in intermittant strings of identical rulers. Notice also that different rulers may have the same fitness, therefore we need a way to display the output of large series of generations. These series will be displayed graphically in a later section of this notebook.
The evolutionary process goes through periods of stability and periods of fluctuation, as we can see by examining a plot of 1,000 generations.
fitnessTable = Table[fitness[newAlphaRuler[i]],{i,1000}];
fitnessPlot = ListPlot[fitnessTable,
PlotJoined->True,
PlotRange->{0,29}];
newAlphaRuler[707]
{0, 1, 3, 8, 15, 16, 26, 30, 32, 35, 36}
fitness[%]
26
span[newAlphaRuler[707]]
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36}
Length[%]
37To create an image of the rulers themselves, we first convert them to binary strings.
bitPlane = Table[0,{i,37}];
convertToBin[x_List]:=ReplacePart[bitPlane,1,
Partition[Plus[1,x],1]];
Thus the ruler {0, 29, 36} would have the respective binary representation:
convertToBin[{0, 29, 36}]
{1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1}
We can use the Mathematica ListDensityPlot[15] function to transform numeric lists into "actual" rulers:
ListDensityPlot[15] function, we obtain this picture:
If we turn this plot sideways, we can match each ruler to its fitness value:
ldp15 = ListDensityPlot[-Transpose[
Map[convertToBin,Table[newAlphaRuler[i], {i,15}]]]];

Show[GraphicsArray[{{ldp15},{lpf15}},GraphicsSpacing->0.0]];

For the entire set of 1000 generations, we have the following picture:
ldp1000 = ListDensityPlot[-Transpose[
Map[convertToBin,Table[newAlphaRuler[i],{i,1000}]]],
Mesh -> False];
Show[GraphicsArray[{{ldp1000},{fitnessPlot}},
GraphicsSpacing->0.0]];
For small values of p, we can look at an exhaustive list of rulers for the number p. A convenient way to generate these rulers is by using the GrayCode[] function from the DiscreteMath`Combinatorica` package. The GrayCode[] function generates a collection of sets such that adjacent sets in the list differ by only one element. From this list of sets the function rulerList[] constructs a set of rulers for the number p, such that two adjacent rulers in the list differ by only one integer.
<<DiscreteMath`Combinatorica`
setList[i_]:=setList[i] = GrayCode[Range[i-1]];
rulerList[i_]:= rulerList[i] =Map[Union[#,{0,i}]&,setList[i]];
rulerList[4]
{{0, 4}, {0, 1, 4}, {0, 1, 2, 4}, {0, 2, 4}, {0, 2, 3, 4},
{0, 1, 2, 3, 4}, {0, 1, 3, 4}, {0, 3, 4}}
The fitness values of the set of all rulers for a given p can be plotted as a "landscape" that shows how the fitness varies across a set of rulers. For example, when p = 10, the maximum fitness is 5. The plot below shows the fitness values for the 2 ^9 rulers arranged so that each ruler differs from its neighbor by one element.
p=10;
fitnessList = Map[fitness,rulerList[10]];
landScape = ListPlot[fitnessList, PlotJoined->True,
PlotRange->All];

The table also shows that for p = 10 there are multiple rulers with the maximum fitness value. The Position[] function shows the location of these rulers. The list of locations can be used as an index to show all the sparse rulers for p = 10.
Position[landScape, Max[landScape]]
{{59}, {67}, {99}, {115}, {123}, {135}, {159}, {191}, {199},
{207}, {223}, {227}, {239}, {247}, {265}, {271}, {281}, {285},
{313}, {319}, {369}, {377}, {389}, {397}, {399}, {415}, {445},
{447}, {451}, {455}, {461}, {477}, {485}, {487}, {493}, {499},
{501}, {503}}
sparserulers = rulerList[10][[Flatten[%]]];
{{0, 1, 2, 3, 6, 10}, {0, 1, 2, 6, 7, 10}, {0, 1, 2, 5, 7, 10},
{0, 1, 2, 4, 7, 10}, {0, 1, 2, 3, 7, 10}, {0, 1, 3, 7, 8, 10},
{0, 1, 5, 7, 8, 10}, {0, 1, 6, 7, 8, 10}, {0, 1, 3, 6, 8, 10},
{0, 1, 4, 6, 8, 10}, {0, 1, 5, 6, 8, 10}, {0, 1, 2, 5, 8, 10},
{0, 1, 4, 5, 8, 10}, {0, 1, 3, 4, 8, 10}, {0, 3, 4, 8, 9, 10},
{0, 1, 4, 8, 9, 10}, {0, 3, 5, 8, 9, 10}, {0, 2, 5, 8, 9, 10},
{0, 3, 6, 8, 9, 10}, {0, 1, 6, 8, 9, 10}, {0, 4, 7, 8, 9, 10},
{0, 3, 7, 8, 9, 10}, {0, 2, 3, 7, 9, 10}, {0, 2, 4, 7, 9, 10},
{0, 1, 4, 7, 9, 10}, {0, 1, 5, 7, 9, 10}, {0, 2, 6, 7, 9, 10},
{0, 1, 6, 7, 9, 10}, {0, 1, 2, 6, 9, 10}, {0, 1, 3, 6, 9, 10},
{0, 2, 4, 6, 9, 10}, {0, 2, 5, 6, 9, 10}, {0, 2, 3, 5, 9, 10},
{0, 1, 3, 5, 9, 10}, {0, 2, 4, 5, 9, 10}, {0, 1, 2, 4, 9, 10},
{0, 2, 3, 4, 9, 10}, {0, 1, 3, 4, 9, 10}}
Running a new experiment for p = 10 and plotting the fitness values of the best ruler at each generation shows that the evolution of rulers for this value of p can reach a maximum fitness by about the 20th generation.
alphaRulerList=Table[newAlphaRuler[i],{i,0,100}];
ListPlot[Map[fitness,alphaRulerList],PlotJoined->True,
PlotRange->All];

One can show how the evolving rulers migrate into the fitness landscape. alphaRulerListReduced produces an abbreviated list of the best rulers, removing successive identical rulers while maintaining the order in which the rulers have evolved.
alphaRulerListReduced=Block[{eLR={newAlphaRuler[0]}},
Do[If[alphaRulerList[[i]] != alphaRulerList[[i-1]],
eLR = Append[eLR,alphaRulerList[[i]]]],
{i, Length[alphaRulerList]}];
eLR]
{{0, 10}, {0, 8, 10}, {0, 7, 8, 10}, {0, 2, 7, 8, 10},
{0, 1, 4, 8, 10}, {0, 2, 3, 8, 10}, {0, 2, 3, 7, 8, 10},
{0, 2, 3, 7, 8, 9, 10}, {0, 3, 7, 8, 9, 10}, {0, 2, 3, 7, 9, 10},
{0, 3, 7, 8, 9, 10}, {0, 1, 3, 7, 8, 10}, {0, 3, 7, 8, 9, 10},
{0, 1, 3, 7, 8, 10}, {0, 3, 7, 8, 9, 10}}
The set sites lists the horizontal values of the positions of the rulers, while sitePoints pairs each horizontal value with its corresponding fitness. siteLines traces the evolution of the rulers through the landscape.
sites=Flatten[Table[Position[rulerList[10],
alphaRulerListReduced[[i]]],
{i,Length[alphaRulerListReduced]}]]
{1, 256, 129, 132, 242, 252, 133, 380, 377, 389, 377, 135, 377, 135,
377}
sitePoints=Graphics[{PointSize[0.02],Map[Point,Table[
{sites[[i]],fitness[alphaRulerListReduced[[i]]]},
{i, Length[alphaRulerListReduced]}]]}];
siteLines=ListPlot[Table[
{sites[[i]],fitness[alphaRulerListReduced[[i]]]},
{i, Length[alphaRulerListReduced]}],PlotJoined->True]

Lightening the color of the landscape makes it easier to distinguish the evolutionary path.
-Graphics-landScapeGray = ListPlot[fitnessList, PlotJoined->True,
PlotStyle->GrayLalphaRulerl[0.5],
PlotRange->All];
Show[{landScapeGray},{siteLines,sitePoints}]
A more general way of thinking about sparse rulers and the fitness landscape is to define the rulers as binary strings. Each mark i on a ruler is represented by a 1 in the ith position of the binary string. For example, the ruler {0,1,3,7,8,10} would be represented by the string {1,1,0,1,0,0,0,1,1,0,1}. Since the rulers to measure all numbers up to the value p always include the numbers 0 and p, the binary representations of the rulers always include a 1 in the first and pth position. The fitness function can therefore be thought of as being defined on a hypercube of dimension p -1. The genetic evolution of rulers corresponds to a tour through this hypercube.
Since the values of 2^(p -1) rapidly become too large, the entire fitness landscape for large values of p cannot be pictured. However, "sketches" of the landscape can provide useful information. For example, every fitness landscape will include the fitness value of the ruler containing all possible marks. This is the "zero valley" located in the graph above at the point 341 on the horizontal axis. There are also zones that are relatively rich in high fitness values, such as the region from 442 to 502 on the horizontal axis. Instead of always beginning with the zero generation rulers as {0, p}, collections of rulers could be seeded in the most promising zones of the landscape.
The numbers of offspring and the mutation rate could vary with the fitness of the evolving rulers. If generation size and mutation rate were to vary inversely with fitness, the