Non-uniform random number generators
by Agner Fog, 2001 - 2007
The file stocc.zip contains a C++ class library of non-uniform random number generators
providing variates with the following distributions: normal,
bernoulli, poisson, binomial, hypergeometric, Wallenius' noncentral hypergeometric,
and Fisher's noncentral hypergeometric distribution. Most distributions are
available in both univariate and multivariate versions. There is also a function to
shuffle numbers.
Most of these functions are fast and accurate, even for extreme values of the
parameters.
The functions are based on the uniform random number generators in randomc.zip
or asmlib.zip.
File list
The archive stocc.zip contains the following
files:
- stocc.htm
- This file. Instructions.
- evolc.htm
- Explanation of examples of simulation of evolution.
- stocc.h
- C++ header file containing class definitions.
You must #include
this in all C++ files that use this library.
- stoc1.cpp
- C++ source code for the non-uniform random number generators for the most
common probability distributions.
- stoc2.cpp
- Alternative code for the distributions poisson, binomial and hypergeometric.
Optimized for situations where these functions are called repeatedly with
the same parameters.
- stoc3.cpp
- Source code for generating random variables with the univariate and
multivariate Wallenius' and Fisher's noncentral hypergeometric distributions.
- fnchyppr.cpp, wnchyppr.cpp, erfres.cpp
- Additional source code used by stoc3.cpp.
- erfresmk.cpp
- A program that was used for making the tables in erfres.cpp.
- ex-stoc.cpp
- Example showing how to use this class library to make random numbers with
various distributions.
- ex-cards.cpp
- Example showing how to shuffle a list of items randomly. Shuffles a deck
of cards.
- ex-lotto.cpp
- Example showing how to generate a sequence of random integers where no
number occurs more than once.
- ex-evol1.cpp, ex-evol2.cpp
- Examples showing simulation of biological evolution.
- testpois.cpp, testbino.cpp, testhype.cpp, testfnch.cpp, testwnch.cpp,
testmfnc.cpp, testmwnc.cpp
- Test programs for testing the poisson, binomial, hypergeometric, and
univariate and multivariate Fisher's and Wallenius' noncentral
hypergeometric distribution generators. Also useful for calculating mean and variance of
these distributions.
- distrib.pdf
- Definition of the statistical distributions that can be generated with
this package.
- sampmet.pdf
- Sampling methods. Theoretical explanation of the methods used for
generating variates with these distributions.
- licence.htm
- Copyright and licence conditions.
Instructions
Unpack the files randomc.zip and stocc.zip
into a new directory. Make sure all the unpacked files are in the same
directory. To get started, just compile one of the example programs for console
mode using any C++ compiler.
The C++ class StochasticLib
can be derived from any of the
random number generator classes in randomc.zip or asmlib.zip.
Choose
the one you like best. The examples use the "Mersenne Twister" class TRandomMersenne
. To use a different random number generator as base class, just change
the
#define RANDOM_GENERATOR TRandomMersenne
directive in the file stocc.h or
make such a define before the
#include "stocc.h"
statement in all your
modules.
The C++ class StochasticLib2
is derived from StochasticLib
.
It contains alternative versions of the functions poisson, binomial and
hypergeometric. StochasticLib
is the best choice if the
parameters to these functions often wary in your program, StochasticLib2
is faster when these functions are called many times with the same parameters.
The C++ class StochasticLib3
is derived from StochasticLib
or StochasticLib2
. It adds the Fisher's and Wallenius' noncentral
hypergeometric distributions, and the multivariate versions of these, to the
library.
The header file stocc.h must be included in all modules that use this class
library. The source code file stoc1.cpp can be included in your project, either
as an #include
, or as a separate module. If you are using StochasticLib2
,
then you need both stoc1.cpp and stoc2.cpp. If you are using StochasticLib3
,
then you need both stoc1.cpp and stoc3.cpp.
A single-threaded application should make only one instance of StochasticLib
or StochasticLib2
or StochasticLib3
. These stochastic library classes
inherit the uniform random number functions from their base class. Therefore,
one object of StochasticLib3
gives you access to all the uniform and non-uniform random
number generators.
Example:
#include "stocc.h" // header file for class library
#include <time.h> // define time function
void main () {
int seed = time(0); // seed for random number generator
StochasticLib sto(seed); // make instance of class
int i = sto.IRandom(0, 99); // make random integer with uniform distribution
int b = sto.Binomial(0.2, 100); // make random integer with binomial distribution
...
}
See the file ex-stoc.cpp for a more detailed example.
Portability
The C++ class library is supposed to work with all C++ compilers and all
operating systems. It has been tested on several different systems.
There are, however, a few system differences that you need to be aware of:
- Error messages. There is no portable way of writing error messages.
Systems with a graphical user interface (e.g. Windows) need a pop-up message
box, while console mode programs and other line oriented systems need output to the standard output or error
output. Therefore, you may have to modify the function
FatalError
in the file userintf.cpp to fit your system. This function is called by the library functions in case of illegal parameter
values or other fatal errors. Experience shows that these
error messages are very useful when debugging a program that uses the
non-uniform random variate generators. You may even enhance the FatalError
function to output additional debug information about the state of your
program.
- Program exit. Windows-like environments may require that the program waits
for the user to press a key before exiting, in order to prevent the output
screen image from disappearing. Therefore, you may have to modify the function
EndOfProgram
in userintf.cpp to fit your system if you experience this problem.
- Multithreaded programs. Use one instance of the random number generator
class for each thread. Make sure each instance is initialized with a
different seed. You may, for example, add 1 to the seed for each instance.
- See randomc.htm for more portability issues.
Function descriptions
- int Bernoulli(double p)
- Bernoulli distribution with probability parameter p.
Returns 1 with probability p, or 0 with probability 1-p.
Error conditions:
Gives error message if p < 0 or p > 1.
- double Normal(double m, double s)
- Normal distribution with mean m and standard deviation s.
This distribution simulates the sum of many random factors.
Definition:
f(x) = (2*pi*s2)-0.5*exp(-0.5*s-2*(x-m)2)
Error conditions:
None.
- long int Poisson (double L)
- Poisson distribution with mean L.
This is the distribution of the number of events in a given time span or a
given geographical area when these events are randomly scattered in time or
space.
Definition:
f(x) = Lx/x! * exp(-L)
Error conditions:
Gives error message if L < 0 or L > 2*109.
- long int Binomial (long int n, double p)
- Binomial distribution with parameters n and p.
This is the distribution of the number of red balls you get when drawing n
balls with replacement from an urn where p is the fraction of red
balls in the urn.
Definition:
f(x) = B(n,x) * px * (1-p)n-x,
where the binomial coefficient B(a,b) = a! / (b! * (a-b)!).
Error conditions:
Gives error message if n < 0 or p < 0 or p > 1.
- long int Hypergeometric (long int n, long int m, long int N)
- Hypergeometric distribution with parameters n, m, N. (Note the order of
the parameters).
This is the distribution of the number of red balls you get when drawing n
balls without replacement from an urn where the urn contains N balls,
where m balls are red and N-m balls are white.
Definition:
f(x) = B(m,x) * B(N-m,n-x) / B(N,n),
where the binomial coefficient B(a,b) = a! / (b! * (a-b)!).
Error conditions:
Gives error message if any parameter is negative or n > N or m > N.
- long int WalleniusNCHyp(long int n, long int m, long int N, double
odds)
- The Wallenius noncentral hypergeometric distribution is the same as the hypergeometric distribution, but with bias. The bias can be seen
as an odds ratio. A bias > 1 will favor the red balls, a bias < 1 will
favor the white balls.
When bias = 1 we have the hypergeometric distribution.
Error conditions:
Gives error message if any parameter is negative or n > N or m > N.
- long int FishersNCHyp(long int n, long int m, long int N, double
odds)
- The Fisher's noncentral hypergeometric distribution is a conditional binomial
distribution resembling the Wallenius noncentral hypergeometric distribution. See the
file distrib.pdf for a definition. Execution may
be slow and inexact when N is high and bias is far from 1.
Error conditions:
Gives error message if any parameter is negative or n > N or m > N.
- void Multinomial (long int * destination, double * source, long int n, int colors)
void Multinomial (long int * destination, long int * source, long int n, int colors)
- Multivariate binomial distribution.
This is the distribution you get when drawing n balls from an urn with
replacement, where there can be any number of colors. This is the same
as the binomial distribution when colors
= 2.
The number of balls of each color is returned in destination
,
which must be an array with colors
places. source
contains the number or fraction of balls of each color in the urn. source
must be a double
or long int
array with colors
places.
The sum of the values in source
does not have to be 1, but it
must be positive. The probability that a ball has color i
is source[i]
divided by the sum of all values in source
.
Error conditions:
Gives an error message if any parameter is negative or if the sum of the
values in source
is zero. The behavior is unpredictable if source
or destination
has less than colors
places.
- void MultiHypergeo (long int * destination, long int * source, long int n, int colors)
- Multivariate hypergeometric distribution.
This is the distribution you get when drawing n balls from an urn without
replacement, where there can be any number of colors. This is the same
as the hypergeometric distribution when colors
= 2.
The number of balls of each color is returned in destination
,
which must be an array with colors
places. source
contains the number of balls of each color in the urn. source
must be an array with colors
places.
Error conditions:
Gives an error message if any parameter is negative or if the sum of the
values in source
is less than n. The behavior is unpredictable
if source
or destination
has less than colors
places.
- void MultiWalleniusNCHyp(long int * destination, long int * source, double * weights, long int n, int
colors)
- Multivariate Wallenius noncentral hypergeometric distribution. This is the
distribution you get when drawing colored balls from un urn without
replacement, with bias.
weights
is an array with colors
places containing the odds for each
color. The probability of drawing a particular ball is proportional to its
weight. The other parameters are the same as above.
This function may be inexact, but uses an approximation with an accuracy that
is better than 1% in almost all cases.
Error conditions:
Gives an error message if any parameter is negative or if the total number
of balls with nonzero weight is less than n. The behavior is unpredictable
if any of the arrays has less than colors
places.
- void MultiFishersNCHyp(long int * destination, long int * source, double * weights, long int n, int
colors)
- The multivariate Fisher's noncentral hypergeometric distribution is a conditional
binomial distribution resembling the multivariate Wallenius noncentral hypergeometric
distribution. See the file distrib.pdf for a
definition.
This function may be inexact, but uses an approximation with an accuracy that
is better than 1% in most cases. The precision can be tuned at the expense
of higher calculation times.
Error conditions:
Gives an error message if any parameter is negative or if the total number
of balls with nonzero weight is less than n. The behavior is unpredictable
if any of the arrays has less than colors
places.
- void Shuffle(int * list, int min, int n)
- This function makes a list of the n numbers from
min
to min+n-1
in random order.
The result is returned in list
, which must be an array with n elements.
The array index goes from 0
to n-1
.
If you want to shuffle something else than integers then use the integers in
list
as an index into a table of the items you want to shuffle.
Error conditions: none. The behavior is unpredictable if the size of the
array list
is less than n
.
- static double LnFac(long int n)
- Log factorial. Mainly used internally.
Definition:
LnFac(n) = loge(n!).
This function uses a table when n < 1024 and Stirling's approximation for
larger n. The table is generated by the constructor.
Error conditions:
Gives an error message if n < 0.
Theory
These distributions are defined in the file distrib.pdf.
The methods used for generating variates with these distributions are described
in the file sampmet.pdf.
A theoretical description of the univariate and multivariate Wallenius and
Fisher's noncentral hypergeometric distributions, including calculation methods and
sampling methods, is given in nchyp.pdf.
Examples showing how to simulate biological evolution using this class
library can be found in evolc.zip.
Copyright and licence
© by Agner Fog.
General public license statement.
Random number
generators page by Agner Fog