C++ Coding - Random Numbers and Monte Carlo

Question Generate pseudo random numbers from the normal distribution.



In this example we use the standard in-built random number generator to generate normally distributed numbers. We shall use the Box-Muller method to transform numbers from a uniform distribution into numbers from the normal distribution. In order to get numbers from the uniform distribution, we will have to convert the set of random integers generated by the standard number generator into real numbers on the interval between 0 and 1. There are many better random number generators than the standard one not least the Mersenne Twister algorithm (download here), but for simplicity we stick with the standard one.

Start with your empty program with the appropriate libraries, in this case we also need cstdlib for random numbers, and later ctime to initialise the sequence. We also write 2 empty functions for the random number generators, which we will fill in as we go along.


#include <cstdlib>
#include <cmath>
#include <ctime>
#include <iostream>
using namespace std;
// return a uniformly distributed random number
double uniformRandom()
{
return 0.;
}
// return a normally distributed random number
double normalRandom()
{
return 0.;
}

int main()
{
return 0;
}
Now we can start with the uniformly distributed random number. The function rand() returns a random sequence of integer values between 0 and RAND_MAX a compiler specific predefined maximum number. To transform this set of integers into the numbers on the interval 0 to 1 is quite simple, simple cast them into doubles and divide by the maximum number. The following function generates numbers on the closed interval

double uniformRandom()
{
return (double)(rand())/(double)(RAND_MAX);
}
This generator can cause problems if 0 or 1 result in invalid numbers in later use cases. For example, in the Box-Muller method we are required to calculate the logarithm of a number from the uniform distribution, and you can't take a log of 0. So we may wish to generate numbers on the open interval. By simply adding 1 or 2 to rand() or RAND_MAX we can shift onto the open/closed, the closed/open, or the open/open interval. We use the open/closed interval here and the code is as follows

#include <cstdlib>
#include <cmath>
#include <ctime>
#include <iostream>
using namespace std;
// return a uniformly distributed random number
double uniformRandom()
{
return ( (double)(rand()) + 1. )/( (double)(RAND_MAX) + 1. );
}
// return a normally distributed random number
double normalRandom()
{
return 0.;
}

int main()
{
for(int i=0;i<100;i++)
cout << " u_i " << uniformRandom() << endl;
return 0;
}
The for loop in the main will print out 100 numbers from the sequence. You should note that every time this code is run the numbers generated are not truly random and will be exactly the same every time. As a consequence if you want the program to run on a different set of numbers each time the program start you can use srand() to seed the generator. For instance we could seed to algorithm with the number 10 with the following

srand(10);
for(int i=0;i<100;i++)
cout << " u_i " << uniformRandom() << endl;
return 0;
Given this seed, a new set of numbers are generated. We can change that number to generate different sets, but the call to seed the algorithm should be done only once otherwise you will get the same numbers over and over. This doesn't help us get a random start though, but we can use the current system time to give us a sort of random start point. The call time(NULL) will return the current system time in seconds, and this can be used effectively as your random seed. The code

srand(time(NULL));
for(int i=0;i<100;i++)
cout << " u_i " << uniformRandom() << endl;
return 0;
will generate apparently random numbers each time it is run.

Before moving on we should really perform a test to see if the set of numbers we are generating really are from the uniform distribution. For a simple test we might look at the probability that a number u from the distribution is less than say 0.25. We know that this probability must be equal to 0.25, so we can use Monte-Carlo type integration to test this out. Note that the following is true

so we can calculate the probability by taking the average of the indicator function on u less than 0.25. The code for this might look something like

// store the number of simulations
int N=100;
double prob=0.,x=0.25;
for(int i=0;i<N;i++)
{
double u=uniformRandom();
if(u<x)
prob = prob + 1.;
}
// output the probability
cout << " P(u<0.25) = " << prob/N << endl;
return 0;
Running this bit of code with larger and larger N should see the value get closer and closer to 0.25.

Once we are satisfied with the uniform random generator we can move onto the normal random generator. The Box-Muller method can be written as

where x1 and x2 will be independent normally distributed numbers. For a simplicity (although inefficient) we shall throw one of those numbers away and just use one of them. In code we write

#include <cstdlib>
#include <cmath>
#include <ctime>
#include <iostream>
using namespace std;
// return a uniformly distributed random number
double uniformRandom()
{
return ( (double)(rand()) + 1. )/( (double)(RAND_MAX) + 1. );
}
// return a normally distributed random number
double normalRandom()
{
double u1=uniformRandom();
double u2=uniformRandom();
return cos(8.*atan(1.)*u2)*sqrt(-2.*log(u1));
}

int main()
{
for(int i=0;i<100;i++)
cout << " x_i = " << normalRandom() << endl;
return 0;
}
Hopefully you should get what look like normally distributed numbers. Next, again we should check that the distribution of numbers we get has the correct mathematical properties. For this you could run a number of statistical tests. A simple check is just for the mean and variance which we can calculate as follows
 
int N=1000;
double mean=0.,variance=0.;
for(int i=0;i<N;i++)
{
double z=normalRandom();
mean = mean + z;
}
mean = mean/N;
cout << " mean = " << mean << endl;

for(int i=0;i<N;i++)
{
double z=normalRandom();
variance = variance + (mean - z)*(mean - z);
}
variance = variance/N;
cout << " variance = " << variance << endl;
Hopefully as you set the value of N larger and larger the mean and variance will tend to 0 and 1.

Comments

Popular posts from this blog

Exclusive Interview: 2011 8TV Hot Chef (美食型男) Jason Cha

Career Lessons from Breaking Bad

Things We Now Know