Domination

Which house do you want:

Lets do it again. Given Fig4 of this paper what options would you reject?

Congraulations, you've just worked out how to do multi-objective optimization

Build Popluation[ i=0 ] by making lots of guesses
Experiment: (mutate, cross over)
Select the best results (a.k.a know as find the Pareto frontier)
Set Population[ i+1 ] to the frontier.
Goto step1

"Give me the fruitful error any time, full of seeds, bursting with its own corrections. You can keep your sterile truth for yourself."
― Vilfredo Pareto

"For many events, roughly 80% of the effects come from 20% of the causes."
― Vilfredo Pareto

Domination measures:

Binary Domination

X domiantes Y if better on at least one, and worse on none.
Returns a boolean.
Widely used. Often stops distinguishing things after 3+ goals.
Insenstive to the size of the domnation

Indicator Domiantion

From Zitler 2004:

X dominates Y if moving X to Y loses less than moving Y to X.
- To make the signal stronger, we shout the difference (raise it to some expendial power).
- To gauge the overall signal, we average across all goals
- To handle goal we want to minimize, maximize, we add a "w" constant (-1,1).
Returns a number.
Better at distinguishing higher problems with 3+ goals, see Table8.
To rank rowX, I
- N=100 times (say), pick rowY at random.
  - Q: Why not comapre to all rows?
- If indicatorDomiantes( X,Y ) then +1/N to rowX
- Then I add a new dom column to the data, wheich we want to maximize.
- see dom.html

For example, here are some rows of auto.csv:

  %cylinders   displacement         horsepower   <weight   >acceltn   model   origin   >mpg
  8           340:360              150:160      3609      8          70:70   1        10
  6           198:225              83:86        2587      16         70:70   1        20
  8           383:455              165:230      4425      10         70:70   1        10
  8           302:305              120:140      3449      10.5       70:70   1        20
  6           198:225              92:97        2833      15.5       70:70   1        20
  8           383:455              165:230      3850      8.5        70:70   1        20
  4           104:114              92:97        2372      15         70:70   3        20
  4           104:114              92:97        2375      17.5       70:70   2        30
  8           340:360              165:230      3693      11.5       70:70   1        20
  ....

From the first row, we see we want to minimize weight and maximize acceleration and mpg. From the other rows, we see that some discretizer has gotten to the displacement and horsepower values are replaced them with some string describing a range e.g.lo:hi = 340:360.

Here's the same data, with dom score added. Shown here are the 5 best and worst rows.

%cylinders  displacement  horsepower  <weight  >acceltn  model  origin  >mpg  >dom
8           383:455       165:230     4746     12        71:71  1       10    0
8           383:455       165:230     4951     11        72:73  1       10    0
8           383:455       165:230     4952     11.5      72:73  1       10    0
8           383:455       165:230     4955     11.5      71:71  1       10    0
8           383:455       165:230     5140     12        71:71  1       10    0
8           383:455       165:230     4354     9         70:70  1       10    0.01
8           383:455       165:230     4425     10        70:70  1       10    0.01
8           383:455       165:230     4464     11.5      71:71  1       10    0.01
8           383:455       165:230     4735     11        72:73  1       10    0.01
8           383:455       165:230     4906     12.5      72:73  1       10    0.01
..          ...           ...         ...      ..        ...    ...     ...   ...
4           85:91         69:72       2070     18.6      78:78  3       40    0.98
4           85:91         ?           1835     17.3      80:80  2       40    0.98
4           68:85         46:65       1825     18.6      77:77  2       40    0.99
4           68:85         69:72       1613     18        71:71  3       40    0.99
4           85:91         46:65       2335     23.7      80:80  2       40    0.99
4           85:91         46:65       1968     18.8      80:80  3       40    1.0
4           85:91         46:65       1975     19.4      81:81  3       40    1.0
4           85:91         46:65       1985     21.5      78:78  2       40    1.0
4           85:91         46:65       2085     21.7      80:80  2       40    1.0
4           96:97         46:65       2130     24.6      82:82  2       40    1.0

Observe that the highest dom scores are assocaited wiht rows with least weight, most acceleration and most mpg (and the lowest dom scores are associated with the reverse).

The Lay of the Land

The real story is that underneath surface features of all these problems is a common mathematical structure called, you guessed it, the landscape.

fit4

To know the landscape is to know how to optimize, how to avoid getting stuck on being over-adapted (hence overspecialized) on some local peak, when as Stewart Brand so aptly puts it...

"Whole mountain ranges of opportunity could be glimpsed in the distance, but getting to them involved venturing 'downhill' into regions of lower fitness".

Studying such landscapes made Brand distrust claims for "optimality" since what you call "optimum" may actually be just a stepping zone to a better place.

Brand's favorite landscape comes from a 1932 genetics paper that discusses how different breeding strategies respond well (or badly) to environmental pressures. In the following, the x-y axis might be "length of hair" and "body weight" and the z-axis might "probability of winning a fight".

wright

Says Brand:

"The first two illustrate how low selection pressure or a high rate of mutation (which comes with small populations) can broaden the range of a species whereas intense selection pressure or a low mutation rate can severely limit a species to the very peak of local fitness. The third diagram shows what happens when the landscape itself shifts, and the population has to evolve to shift with it."
"The bottom row explores how small populations respond to inbreeding by wandering ineffectively. The best mode of exploration Wright deemed the final diagram, showing how a species can divide into an array of races that interact with one another. That jostling crowd explores well, and it can respond to opportunity."

Holes, poles, saddles, local minima, flat, brittle

So to understand search, understand the landscape. If you know the landscape, you can see where it can trap and where it can help us out. One such trap is the saddle which, in the above diagram is the flat space between the mountain (called a pole) and the hole next to it.

Note that if walk along the saddle, you might think you are in a stable space of solutions. But be warned, one slip to the left or right and the landscape changes dramatically. In the above space you might fall into a hole or a pole.

Another trap is the local minima that seems like a good idea but if you get sucked into it, you may never find the much better place:

local

Another bad landscape is one that is completely flat. Try as you like, you walk along way around this one before finding anything better or worse:

flat

The opposite of flat is bumpy:

bump

Bumpy landscapes are common so Harman comments that understanding the neighborhood of our solutions is an open and pressing issue in search-based software engineering (SBSE):

"In some software engineering applications, solution robustness may be as im-portant as solution functionality. For example, it may be better to locate an area of the search space that is rich in fit solutions, rather than identifying an even better solution that is surrounded by a set of far less fit solutions."
"Hitherto, research on SBSE has tended to focus on the production of the fittest possible results. However, many application areas require solutions in a search space that may be subject to change. This makes robustness a natural second order property to which the research community could and should turn its attention."

Bumpy landscapes mean that, sometimes, to achieve better goals you may have to first give up some of your current achievements. In the history of A.I. this is also called Sussmann's anomaly- that sometimes the way to "better" is via "worse". There are many ways to "jump over" these anomalies. Sussman (and his supervisor, Marvin Minsky) believed that intelligence requires an exolicit list of exceptions or tricks and that any plan for making things better had better be auditted by a "debugging" system. That is a knowledge-full approach that requires some analyst to supply descriptions of "when to jump around". Alternate knowledge-less approaches are:

Stochastic jumps using, say, simulated annealing;
The retries mechanism discussed below;
Momentum constants added to the mutators of stocahstic search. Such momentum constants resist sudden stops to the search: if a local maximum is reached, the momentum constant would push the inference a little further just to see if anything better lies beyond the current position. For example, in the following landscape, momentum would nudge the ball over that little lip into a better (and lower) world.

mom

Which is best: knowledge-full or knowledge-less? Well, that depends on the nature of the problem, the intrinsic value of the knowledge, etc etc. But the general engineering trade-off is that knowledge-less approaches are faster to build and maintain, while the knowledge-full approaches perform comparatively better. FYI- I used to work on knowledge-full approaches but I have found my life to be easier since I switched to knowledge-less.

Walkting the Territory

A.k.a. multi-objective domination

Keys	Action
`?`	Open this help
`n`	Next page
`p`	Previous page
`s`	Search