Simulation of biological evolution
by Agner Fog
stocc.zip contains two simple C++ programs showing
examples of how to simulate biological evolution using a library of non-uniform
random number generators.
File list
The archive stocc.zip contains several files, of
which the following are relevant to simulation of evolution:
- evolc.htm
- This file. Instructions
- randomc.htm
- Description of C++ class library of uniform random number generators used
by the programs.
- stocc.htm
- Description of C++ class library of non-uniform random number generators
used by the programs.
- ex-evol1.cpp
- Simulation program where selection is based on competition for a limited
resource or differential predation.
- ex-evol2.cpp
- Simulation program where selection is based on differential fertility.
- ex-stoc.cpp
- Example showing how to use the non-uniform random number generators.
- mersenne.cpp
- A random number generators to use.
- stoc1.cpp stoc3.cpp
- Generators of non-uniform random variates with various distributions.
- wnchyppr.cpp
- Definition of Wallenius' noncentral hypergeometric distribution
- fnchyppr.cpp
- Definition of Fisher's noncentral hypergeometric distribution
- userintf.cpp
- Platform-dependent user interface functions
- randomc.h stocc.h
- C++ header files containing function prototypes and class definitions.
Instructions
Unpack the files randomc.zip and stocc.zip into a new directory. Make
sure all the source files are in the same directory.
Compile the file ex-evol1.cpp or ex-evol2.cpp in console mode using a C++
compiler.
Run the program.
Change the parameters as you like by editing the file ex-evol1.cpp or
ex-evol2.cpp at the place marked // define parameters
,
and try again.
Additional description of the uniform and non-uniform random number
generators used by these programs can be found in randomc.htm
and stocc.htm, respectively. Theoretical description of
Wallenius' noncentral hypergeometric probability distribution, used for simulating
Darwinian selection, is given at www.agner.org/random/theory.
Portability
These programs should work with all modern C++ compilers in all operating
systems.
See randomc.htm and stocc.htm
for more portability issues.
Theory
These programs show examples of how to make Monte Carlo simulation of
biological evolution.
The algorithm works on gene pools rather than on individuals. It does not
follow the fate of each individual. Instead, it calculates the probability
distribution of possible outcomes and then uses a function which returns a
random variable with the desired distribution. This simulation method is much
faster than the traditional simulation method based on individuals.
The program assumes discrete non-overlapping generations in a panmictic
(perfectly mixing) population. Inheritance is Mendelian. Selection is considered
at only one locus, which can hold the genes a or b.
Two different models with different selection mechanisms are provided here. The first
model assumes that selection is based on competition for a limited resource or
selective predation. The
second model involves selection in the form of differential breeding success.
The simulation of model 1 goes through the following steps for each
generation:
- Mutation. The number of genes of each type that mutate in one
generation follows the binomial distribution:
mutations of gene x = binomial (number of x, mutation rate).
It is assumed that the mutation rate is the same in both directions.
- Breeding. The total number of offspring follows a poisson or normal
distribution.
The number of genes of each type in the gene pool of the child generation
follows a binomial distribution, where the probability of each gene is the
frequency of this gene in the parent generation.
- Split into genotypes. Each individual has two genes at the locus in
question. This makes three possible genotypes: (a,a), (a,b) and (b,b). The
number of individuals with each genotype is calculated for a random
combination of all the a and b genes in the child gene pool. This process
requires two calls to the hypergeometric generator, one for each allele.
- Limitation. If the carrying capacity of the habitat is exceeded then some
individuals must die. The carrying capacity may fluctuate with a normal
distribution. Similar fluctuations in the population size may be caused by
predation.
The number of deaths of each genotype follows a multivariate Wallenius' noncentral hypergeometric distribution, where the
odds parameter is the reciprocal of the relative fitness of the genotypes.
- Generation shift. The child generation becomes the parent generation of
the next iteration.
Model 2 has the following steps:
- Mutation. Same as above.
- Split into genotypes. Same as above, but applied to parent generation.
- Split into males and females. The number of individuals of each genotype
is split into males and females. This is a binomial distribution with the
sex-ratio as probability parameter.
- Breeding potential. The maximum number of children that the parent
generation can produce follows a poisson or normal distribution. If this
number exceeds the carrying capacity of the habitat then it is reduced to the
carrying capacity. The carrying capacity may fluctuate with a normal
distribution.
- Differential fertility. The probability of each child genotype is
calculated as follows:
For all possible combinations of male genotype i and female genotype j, the
probability of an i x j mating is calculated from the product of the number
of i males and the number of j females. For each of these i x j combinations, the
probability of each child genotype is calculated according to Mendel's laws.
These probabilities are multiplied by the probability of an i x j mating and
by the fitness of the pair, and summated for all combinations of i and j.
The fitness of the pair can be either the sum or the product of the fitness
contribution of each parent. The number of children of each genotype actually generated is
calculated as a multivariate binomial distribution with these probabilities
as parameters.
- Generation shift. Same as above.
These programs may form the basis of your own simulation experiments. A lot
of refinements and extensions to the models are possible, and the library of
non-uniform random number generators is at your disposition for the
modeling of different selection mechanisms.
A more advanced simulation program, called Altruist, allows structured or
viscous populations, group selection, epistatis, punctuated equilibria, etc. You
can download it from the link below.
References
More details about the genetics of evolution can be found in a textbook on
population genetics.
The theory of probability distributions such as the poisson, binomial,
hypergeometric and normal distributions can be found in a textbook on statistics.
The Wallenius' noncentral hypergeometric distribution is defined at www.agner.org/random/theory.
Further information on the uniform and non-uniform random number generators used here
can be found on
www.agner.org/random/.
A more advanced simulation program using these methods is the Altruist
program, available from
www.agner.org/evolution. This
program can simulate models with structured or viscous populations with group
selection, epistasis, punctuated equilibria, etc.
Copyright and licence
© 2002-2007 by Agner Fog.
General public license statement.