7. Parallel ECF

To use parallelization, the framework must be compiled with MPI support enabled (see the installation instructions).

When using the parallel version of ECF, some additional components become available; they are described below. To run parallel ECF in more than one process, start it through the launcher of the installed MPI implementation (typically mpiexec or mpirun).
Running ECF in parallel is possible regardless of the choice of genotype (individual structure) and, for some parallelization methods, regardless of the algorithm used.

Parallel algorithms

Parallel algorithms operate on a single deme and may be run in more than one process (see below for parallel models with multiple demes). The currently supported parallel algorithms are:

<Algorithm>
	<AlgSGenGPEA> <!-- synchronous generational global parallel EA -->
		<Entry key="crxprob">0.5</Entry> <!-- crossover rate -->
		<Entry key="selpressure">10</Entry> <!-- selection pressure: how much the best individual is 'better' than the worst -->
		<Entry key="jobsize">5</Entry> <!-- number of individuals sent for evaluation to a worker process -->
	</AlgSGenGPEA>

	<AlgAEliGPEA> <!-- asynchronous steady-state global parallel EA -->
		<Entry key="tsize">3</Entry> <!-- tournament size -->
		<Entry key="jobsize">5</Entry> <!-- number of individuals sent for evaluation to a worker process -->
	</AlgAEliGPEA>
</Algorithm>
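
To make the role of "jobsize" more concrete, here is a minimal Python sketch of how a master process might group individuals into jobs before sending them to workers. This is an illustration only, not ECF code: the function name split_into_jobs is hypothetical, and the handling of a final, smaller job is an assumption.

```python
def split_into_jobs(individuals, jobsize):
    """Group individuals into consecutive jobs of at most `jobsize` items.

    Hypothetical helper for illustration; ECF's internal scheduling may differ.
    """
    if jobsize < 1:
        raise ValueError("jobsize must be at least 1")
    # Assumption: leftover individuals form one last, smaller job.
    return [individuals[i:i + jobsize] for i in range(0, len(individuals), jobsize)]


# 12 individuals (represented by their indices) with jobsize = 5
jobs = split_into_jobs(list(range(12)), 5)
print(len(jobs), [len(job) for job in jobs])
```

A larger jobsize means fewer messages between master and workers, while a smaller one tends to balance the load better when evaluation times vary.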

Implicit parallelization

NOTE: this section is still under development.

Implicit parallelization uses a sequential algorithm, but lets the user state which parts of the algorithm should be executed in parallel. Currently, the following parts can be parallelized:

evaluation: individuals to be evaluated are sent to worker processes and fitness is returned to the master
mutation: individuals selected for mutation are sent to worker processes, where they are mutated and evaluated, and are then returned to the master

The only thing needed to use implicit parallelization is to set the following Registry options:

<Entry key="parallel.type">eval</Entry> <!-- implicit parallelization method: eval - evaluation, mut - mutation + eval -->
<Entry key="parallel.jobsize">10</Entry> <!-- implicit parallelization jobsize (individuals per job) -->

This kind of parallelization is asynchronous, because the algorithm may use (incomplete) individuals that have not yet been returned from the workers.
ECF also offers synchronous implicit parallelization: in this mode the affected individuals are temporarily removed from the population until they are returned to the master. Synchronicity is controlled with the following Registry parameter:

<Entry key="parallel.sync">1</Entry> <!-- implicit parallelization synchronicity: 0 - async, 1 - sync (default: 0) -->

Although a sequential algorithm is used, these options are valid only with the parallel version of ECF.
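
The difference between the two modes can be pictured with a small Python sketch. It only models which individuals the algorithm can still pick while some of them are out at the workers; the function name and the data layout are hypothetical and do not reflect ECF's internals.

```python
def visible_individuals(population, in_flight, sync):
    """Return the indices the algorithm may still select.

    population: list of individuals (any objects)
    in_flight:  set of indices currently sent to worker processes
    sync:       True mimics parallel.sync = 1, False mimics parallel.sync = 0
    """
    if sync:
        # Synchronous mode: individuals at the workers are temporarily
        # removed from the population until they are returned.
        return [i for i in range(len(population)) if i not in in_flight]
    # Asynchronous mode: every individual stays selectable, including
    # (incomplete) ones that have not yet been returned.
    return list(range(len(population)))


population = ["a", "b", "c", "d"]
print(visible_individuals(population, {1, 2}, sync=True))
print(visible_individuals(population, {1, 2}, sync=False))
```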

Multiple deme population

The above algorithms distribute only the fitness evaluation between different processes, and are therefore most suitable for problems with computationally expensive fitness evaluation. If this is not the case, then distributing the population across several demes (subpopulations, islands) usually yields better results in terms of speedup and convergence.

To use a multiple-deme population in ECF, set the parameter "population.demes" in the Registry block to a value greater than 1 (the default is 1). For example:

<Entry key="population.size">50</Entry> <!-- number of individuals (default: 100) -->
<Entry key="population.demes">4</Entry> <!-- number of demes (default: 1) -->

In the above setting, the population consists of 4 demes, each containing 50 individuals (200 individuals in total).
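
The arithmetic for this setting can be checked with a short Python sketch that reads the two entries and multiplies them. It is an illustration under assumptions: the enclosing <Registry> tag name is inferred from the term "Registry block", and the helper names are hypothetical. The fallback values 100 and 1 are the defaults stated in the comments above.

```python
import xml.etree.ElementTree as ET

# Assumption: Registry entries are wrapped in a <Registry> element.
CONFIG = """
<Registry>
  <Entry key="population.size">50</Entry>
  <Entry key="population.demes">4</Entry>
</Registry>
"""


def registry_entries(xml_text):
    """Collect all <Entry key="...">value</Entry> pairs into a dict."""
    root = ET.fromstring(xml_text)
    return {entry.get("key"): entry.text.strip() for entry in root.iter("Entry")}


def total_individuals(entries):
    """population.size is the size of each deme, so the total is size * demes."""
    size = int(entries.get("population.size", 100))   # documented default: 100
    demes = int(entries.get("population.demes", 1))   # documented default: 1
    return size * demes


print(total_individuals(registry_entries(CONFIG)))  # prints 200
```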

Each deme currently runs the same algorithm (the one stated in the Algorithm XML block). Note that the demes may run sequential as well as parallel algorithms; in the latter case, the parallel algorithm operates only within each deme (i.e. it does not 'see' other demes by default).

The migration operator

Multiple demes do not make much difference by themselves if there is no communication between the subpopulations - that's where the migration operator steps in. The migration operator copies some individuals from one deme into another, depending on the user parameters. The currently supported migration parameters (stated in the Registry block) are:

<Entry key="migration.freq">10</Entry> <!-- individuals are exchanged every 'freq' generations (default: none) -->
<Entry key="migration.number">5</Entry> <!-- number of individuals to be sent to another deme -->

The migration operator is not active unless you set the migration.freq parameter. The remaining migration properties are currently fixed as follows:

migration occurs every 'migration.freq' generations (for all the demes)
individuals chosen for migration: the best one plus additional random ones (up to 'migration.number' in total)
deme sending structure: ring (each deme sends its individuals to the next deme, i.e. the one with index + 1)
individuals chosen for replacement at the recipient deme: random, excluding the best individual
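
The migration rules listed above can be sketched in a few lines of Python. This is not ECF's implementation, just a model of the described behaviour under some assumptions: individuals are represented by their fitness values (higher is better), the last deme sends to the first one to close the ring, 'number' is at least 1, and the function name migrate_ring is hypothetical.

```python
import random


def migrate_ring(demes, number, rng):
    """Copy `number` individuals from each deme into the next deme in a ring.

    demes:  list of lists of fitness values (higher is better, an assumption)
    number: how many individuals each deme sends (mimics migration.number)
    rng:    a random.Random instance
    """
    # Pick emigrants from every deme first, so all sends use the
    # pre-migration state: the best individual plus random others.
    outgoing = []
    for deme in demes:
        best = max(range(len(deme)), key=lambda i: deme[i])
        others = [i for i in range(len(deme)) if i != best]
        extra = rng.sample(others, min(max(number - 1, 0), len(others)))
        outgoing.append([deme[i] for i in [best] + extra])

    for src, emigrants in enumerate(outgoing):
        target = demes[(src + 1) % len(demes)]  # ring: index + 1, wrapping around
        best = max(range(len(target)), key=lambda i: target[i])
        # Replace random individuals, but never the recipient's best one.
        slots = [i for i in range(len(target)) if i != best]
        chosen = rng.sample(slots, min(len(emigrants), len(slots)))
        for slot, individual in zip(chosen, emigrants):
            target[slot] = individual
    return demes


demes = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
print(migrate_ring(demes, 2, random.Random(0)))
```

After one call, every deme keeps its own best individual and also holds a copy of the best individual of the preceding deme.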

Note: a multiple-deme population and the migration operator can also be used with sequential ECF, where a single process evolves all the demes in sequence.

The ECF parallel models

Combining the previous components, ECF offers the following parallel combinations:

single deme, sequential algorithm - sequential EA
single deme, parallel algorithm - global parallel EA
single deme, implicitly parallel algorithm - global parallel EA
multiple demes, sequential algorithm - distributed EA
- in this setting, the number of MPI processes must equal the number of demes stated in the configuration file!
multiple demes, parallel algorithm - hybrid distributed EA
- in this setting, the number of processes must be equal to or greater than the number of demes, and should be greater for the parallel algorithm to make sense
multiple demes, implicitly parallel algorithm - hybrid distributed EA
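
The two process-count rules stated in the list above can be written down as a small check to run before launching a job. This is a hypothetical helper for parallel ECF runs, not part of ECF; it encodes only the constraints given above and assumes no constraint for the cases the list leaves open (single deme).

```python
def process_count_ok(demes, processes, algorithm):
    """Check the number of MPI processes against the rules listed above.

    demes:     value of population.demes
    processes: number of MPI processes the run is started with
    algorithm: "sequential" or "parallel" (hypothetical labels)
    """
    if demes < 1 or processes < 1:
        raise ValueError("demes and processes must be at least 1")
    if algorithm not in ("sequential", "parallel"):
        raise ValueError("unknown algorithm kind: " + algorithm)
    if demes == 1:
        # Assumption: no constraint is stated for a single deme.
        return True
    if algorithm == "sequential":
        # Distributed EA: exactly one process per deme.
        return processes == demes
    # Hybrid distributed EA: at least one process per deme.
    return processes >= demes


print(process_count_ok(4, 4, "sequential"))
print(process_count_ok(4, 6, "sequential"))
print(process_count_ok(4, 6, "parallel"))
```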

The choice of the algorithm and the migration parameters will depend mostly on the problem at hand and on the parallel environment (the machines) the algorithm is run on.