How to Solve Optimisation and Search Problems with Genetic Algorithm

Discover the basics of genetic algorithms, how it works and the key elements to define to model a problem you want to solve

Sébastien Combéfis
5 min readJul 23, 2021
Photo by Prachatai on Flickr (October 16, 2017).

Genetic algorithms is one of the main types of evolutionary algorithms, a set of techniques inspired by the natural selection process from biology. These latter are population-based techniques, evolving a collection of individuals who are solution candidates to the problem to be solved [1].

Recombination of individuals, mixing the characteristics conveyed by the population is the main way to improve the solution candidates. Evolutionary algorithms are highly stochastic, leading to results involving randomness and with a certain uncertainty. Genetic algorithm is the particular case in which the individuals are represented as strings over a finite alphabet [2].

Reproduction cycle

Any evolutionary algorithm, and therefore genetic algorithms, follows the same general scheme which proceeds as follows:

  1. Generation of an initial population with random individuals.
  2. Evaluation of the fitness of each individual of the population.
  3. Repetition of re-generational steps until some condition is satisfied.

The fitness of an individual represents its quality, measuring whether it is a good solution candidate. The third step is in fact a loop that results in successive evolutions among the individuals of the population, trying to improve their fitnesses. It typically consists of four successive steps:

  1. Selection of best-fit individuals (parents) for reproduction.
  2. Breeding of new individuals (children) with crossover and mutations.
  3. Evaluation of the fitness of the new individuals (modified children).
  4. Replacement of least-fit individuals with new ones.

This reproduction cycle is repeated until a given condition is reached. For example, they can be executed to result in a given number of generations. Another possibility is to achieve a certain minimum level of quality, measured with the average fitness on the best individuals, for example.

The reproduction cycle starts with an initial population of randomly generated individuals who are then subject to evolutions going through a cycle with reproductions, modifications and evaluations that may result in their elimination from the population if their quality is not high enough (picture by the author).

Representation of individuals

Genetic algorithms are a particular case of evolutionary algorithms in which the individuals of the population are each represented as a string over a finite alphabet. The population is, therefore, just a set of strings.

To better understand how to solve a problem using genetic algorithms, let’s examine how to solve a Sudoku. This is an example of a search problem in which digits between 1 and 9 have to be found to fill the missing squares, so that to satisfy the constraints of a Sudoku (not twice the same digit in each row, column and nine-cell block).

Picture by Tim Stellmach on Wikipedia (April 8, 2017).

For this problem, a solution candidate can be represented as a sequence of 51 digits between 1 and 9, corresponding to the digits to fill in the Sudoku from top to bottom and left to right. Therefore, the population consists in a set of such 51-digit sequences, namely the individuals.

Fitness function

It is important to be able to evaluate the quality of the individuals, who represent the solution candidates. Only the best individuals should indeed be kept from one generation to another. The fitness function associates a single figure of merit to each individual, the ones with the largest fitness value being the best ones.

For the Sudoku solving problem, the fitness function can be defined as 27 minus the total number of rows, columns and blocks in which there are conflicts. Given this definition, an individual who is a solution will have a fitness of 27.

Selection method

In each reproduction cycle, a selection of parents among the individuals of the population must be carried out. Several strategies can be used for this selection. Depending on the chosen strategy, the mixing of the population will be different, resulting in a final better or worst result.

For example, with the roulette wheel selection method, individuals are randomly selected with a probability proportional to their relative fitness values in the population. Therefore, the best individuals have a greater chance of being selected. Another method is the elitism selection which just takes the individuals with the best fitness as parents.

Reproduction and modification

The parents who have been selected then reproduce with each other, resulting in the creation of children, that is, new individuals. Two operations typically occur during this phase of the reproduction cycle:

  1. The crossover operation recombines two individuals to create two new ones. A typical way to conduct this operation consists in taking the first part of one individual to combine it with the second part of another one, and vice-versa.
  2. The mutation operation makes a local modification to an individual. A typical way to mutate an individual consists in changing one letter of its string representation by another one from the finite alphabet.

The main goal of the crossover is to speed up the search in the early evolution phase, combining together good individuals in the hope of creating better ones. Mutations are, on their side, used to restore information lost from the previous generations in the population.

In the case of our Sudoku solving problem, the crossover operation can select the n first digits of one of the parents followed by the 51n digits of the other parent to create a child. The mutation operation can just change one of the 51 digits by another one.

Termination

The reproduction cycle continues until a given termination condition is reached. In the case of a search problem, the fitness function can generally measure whether the solution has been reached. In this case, the algorithm typically continues until one of the individuals has the best achievable fitness value. If this takes too much time, or in the case of an optimisation problem, the algorithm can be stopped earlier. In this case, the obtained solution will typically only be an approximate solution whose quality can be evaluated thanks to its fitness value.

To conclude, this article introduces the basics of genetic algorithms and how it works. As you may have noticed, many elements must be defined when modelling a given problem: how the individuals are represented, a fitness function to evaluate their quality, the method to select the parents for the reproduction, the crossover and mutation operations to create the children and the termination condition.

The choices made for the modelling have an influence on whether the algorithm will or will not find a solution or a good enough one. It also has an influence on the running time necessary to obtain a solution. Also, due to the stochastic property of genetic algorithms, there are no guarantees that they converge to a solution within a reasonable time or at all. The principles of genetic algorithms is quite simple, but the modelling activity is difficult and requires human expertise.

References

[1] David W. Corne and Michael A. Lones (2018). Evolutionary Algorithms, arXiv preprint arXiv:1805.11014.
[2] S.N. N. Sivanandam and S. N. Deepa (2010). Introduction to Genetic Algorithms, Springer, ISBN: 978–3–642–09224–4.

--

--

Sébastien Combéfis

Currently working as a business process analyst, also teaching computer science in higher education and developing applications in my spare time.