**Question:**

##### Write a function to calculate N(x) the cumulative normal distribution with mean zero and variance one given by the formula

Again, first we think about the stages of our program. We must choose a method to implement
the integration, test and verify it. Once it is then applied to the problem we need to decide how
to deal with the concept of infinity in this setting. First let use code up the integration method.
To be able to check the method, you should choose a simple function with a known
integral so that we can compare accuracy. Start with a new project and an empty cpp
file.

#include<iostream>#include<cmath>usingnamespacestd; intmain() {// do nothingreturn0; }

Compile and check it runs. We are going to implement Simpson’s method for integration firstly on the sin function so that we can check the results. The method is given by

| (1) |

Start by declaring the input parameters required by the method, including the number of
steps and the integration range, and follow with the variables used inside such as the dummy
integration variable s and the step size h. We should supply default values and initialise variables
if required

// number of stepsint N=10;// range of integrationdouble a=0,b=1.;// local variablesdouble s,h,sum=0.;// inialise the variablesh=(b-a)/N;

Next we need to add the first few terms and the last one to sum, before adding a loop to go over
the rest.

// add in the first few termssum = sum +sin(a) + 4.*sin(a+h);// and the last onesum = sum +sin(b);

Note that we should really check that N is even as this method relies on that fact. Since
we have already taken care of the terms for 0, 1 and N, the loop must run over the
terms 2 up to and including N-1. Check your loop runs over the correct values before
continuing

// loop over terms 2 up to N-1for(int i=1;i<N/2;i++) { s = a + 2*i*h;// check even termscout << 2*i << " " << s << endl; s = s + h;// check odd termscout << 2*i+1 << " " << s << endl; }

Now add them in and check against the analytical solution.

intmain() {// number of stepsint N=10;// range of integrationdouble a=0,b=1.;// local variablesdouble s,h,sum=0.;// inialise the variablesh=(b-a)/N;// add in the first few termssum = sum +sin(a) + 4.*sin(a+h);// and the last onesum = sum +sin(b);// loop over terms 2 up to N-1for(int i=1;i<N/2;i++) { s = a + 2*i*h; sum = sum + 2.*sin(s); s = s + h; sum = sum + 4.*sin(s); }// complete the integralsum = h*sum/3.;// output resultscout << sum << " " << 1.-cos(b) << endl;return0; }

You should check accuracy by varying the value of N.

Next we are going to change this algorithm to solve the normal distribution. Do this in 2
steps, first go through and change all calls to the sin function by exp(-s*s/2.), then multiply the
final result by 1/sqrt(2*pi). The result (for a=0 and b=1) should be 0.341345. Now we have to
deal with infinity, so how can we integrate over infinity? For the normal distribution
this is quite easy as it is an even function and we know that the integral between
0 and infinity is one half. So simply adding one half to the result of the integration
between 0 and b gives the value of N(b) for all b. At this stage your code should look
like

// add in the first few termssum = sum +exp(-a*a/2.) + 4.*exp(-(a+h)*(a+h)/2.);// and the last onesum = sum +exp(-b*b/2.);// loop over terms 2 up to N-1for(int i=1;i<N/2;i++) { s = a + 2*i*h; sum = sum + 2.*exp(-s*s/2.); s = s + h; sum = sum + 4.*exp(-s*s/2.); }// complete the integralsum = 0.5 + h*sum/3./sqrt(8.*atan(1.));// output resultscout << sum << endl;

Now move this algorithm into a function with the definition

doublenormalDistribution(double x)

The last thing to take account of is what happens when x is either very large and
positive or very large and negative. The best thing to do here is use analytical results
and return 0 if x is large and negative and 1 if x if large and positive. Don’t forget
to set the value of b to be x and choose a value of N sufficiently large to maintain
accuracy.

#include<iostream>#include<cmath>usingnamespacestd; doublenormalDistribution(double x) {if(x<-10.)return0.;if(x>10.)return1.;// number of stepsint N=2000;// range of integrationdouble a=0,b=x;// local variablesdouble s,h,sum=0.;// inialise the variablesh=(b-a)/N;// add in the first few termssum = sum +exp(-a*a/2.) + 4.*exp(-(a+h)*(a+h)/2.);// and the last onesum = sum +exp(-b*b/2.);// loop over terms 2 up to N-1for(int i=1;i<N/2;i++) { s = a + 2*i*h; sum = sum + 2.*exp(-s*s/2.); s = s + h; sum = sum + 4.*exp(-s*s/2.); }// complete the integralsum = 0.5 + h*sum/3./sqrt(8.*atan(1.));// return resultreturnsum; } intmain() { cout.precision(16); cout << "N(0)=" <<normalDistribution(0.) << endl; cout << "N(1)=" <<normalDistribution(1.) << endl; cout << "N(2)=" <<normalDistribution(2.) << endl;return0; }

Here we needed to choose N=2000 to get close to double precision, meaning that a single function call is incredibly expensive. There are other, more efficient approximations out there. If you are using a compiler that is math library C99 compliant you will have access to the erfc function, which will give you double precision accuracy using the formula

Unfortunately if you are using Visual Studio it is not yet implemented (might yet be in VS2013)
so we need another method. One such method is described in West (2005) and they give a VB
code implementation of the method. We are going to follow that method closely. Start with
your empty program, I have included some benchmark values in comments to check
against

#include<iostream>#include<cmath>usingnamespacestd; intmain() {// Benchmark values of N(x)// N(0) = 0.5// N(1) = 0.841344746068543// N(2) = 0.977249868051821return0; }

Compile and check the code runs with no output.

The algorithm is given by the following expressions

// calculate \sqrt{2\pi} upfront oncedouble RT2PI =sqrt(4.0*acos(0.0));// calculate 10/\sqrt{2} upfront oncedouble SPLIT = 10./sqrt(2);// enter coefficients for the polynomials P and Qdouble a[] = {220.206867912376,221.213596169931,112.079291497871,33.912866078383,6.37396220353165,0.700383064443688,3.52624965998911e-02}; double b[] = {440.413735824752,793.826512519948,637.333633378831,296.564248779674,86.7807322029461,16.064177579207,1.75566716318264,8.83883476483184e-02};// calculate the value of N at x=1double x=1.;

where we have chosen x = 1 to test. Since we know what x is we don’t need to consider the 3
cases. Now calculate the polynomial expressions using a nested multiplication (important for
accuracy) and we get

double NDash =exp(-x*x/2.0)/RT2PI; double Px = (((((a[6]*x + a[5])*x + a[4])*x + a[3])*x + a[2])*x + a[1])*x + a[0]; double Qx = ((((((b[7]*x + b[6])*x + b[5])*x + b[4])*x + b[3])*x + b[2])*x + b[1])*x + b[0]; cout << Px << " " << Qx << " " << NDash << endl;// output is 594.522 2272.83 0.241971

the output you should expect is shown. We can now estimate the value of N(x)

cout.precision(16); cout << 1 -exp(-x*x/2.0)*Px/Qx << endl;// output is 0.841344746068543

You now need to include the different cases, such as negative x, etc, and then move everything
into a function. After some testing you should arrive at an algorithm that looks something
like:

#include<iostream>#include<cmath>usingnamespacestd;/** A * * fast* and accurate polynomial approximation to $N(x)$ (double precision):* $$ N(x)= 1-\sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}, \quad 0 \leq x\leq \frac{10}{\sqrt{2}},$$* $$ N(x)= 1-N'(x)\frac{1}{F_4(x)}, \quad \frac{10}{\sqrt{2}} < x\leq 37,$$* $$ N(x)= 1, \quad x > 37,$$* $$N(x)=1-N(-x),\quad x<0.$$* Here* $$* P(x) = \sum_{i=0}^6 a_i x^i, \quad \quad Q(x) = \sum_{i=0}^7 b_i x^i* $$* and the coefficients are:* \begin{tabular}{lc||lc}* \hline* $a_0$ & 220.206867912376& $b_0$&440.413735824752 \\* $a_1$ & 221.213596169931& $b_1$&793.826512519948 \\* $a_2$ & 112.079291497871& $b_2$& 637.333633378831\\* $a_3$ & 33.912866078383& $b_3$&296.564248779674 \\* $a_4$ & 6.37396220353165& $b_4$&86.7807322029461 \\* $a_5$ & 0.700383064443688& $b_5$& 16.064177579207\\* $a_6$ & 0.0352624965998911& $b_6$&1.75566716318264 \\* & & $b_7$& 0.0883883476483184\\* \end{tabular}* For the function $F_4(x)$ we use the recurrence relation* $$* F_0(x) = x + \frac{13}{20}* $$* $$* F_{i}(x) = x + \frac{5-i}{F_{i-1}(x)} \quad \quad \text{for }i=1,2,3,4* $$** On entry x is a real value, and we return double precision estimate of the cummulative normal distribution*/doublenormalDistribution(double x) {// calculate \sqrt{2\pi} upfront oncestaticconstdouble RT2PI =sqrt(4.0*acos(0.0));// calculate 10/\sqrt{2} upfront oncestaticconstdouble SPLIT = 10./sqrt(2);staticconstdouble a[] = {220.206867912376,221.213596169931,112.079291497871,33.912866078383,6.37396220353165,0.700383064443688,3.52624965998911e-02};staticconstdouble b[] = {440.413735824752,793.826512519948,637.333633378831,296.564248779674,86.7807322029461,16.064177579207,1.75566716318264,8.83883476483184e-02};constdouble z =fabs(x);// Now N(x) = 1 - N(-x) = 1-\sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}// so N(-x) = \sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}// now let \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = Nz// Therefore we have// Nxm = N(x) = \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = Nz if x<0// Nxp = N(x) = 1 - \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = 1-Nz if x>=0double Nz = 0.0;// if z outside these limits then value effectively 0 or 1 for machine precisionif(z<=37.0) {// NDash = N'(z) * sqrt{2\pi}constdouble NDash =exp(-z*z/2.0)/RT2PI;if(z<SPLIT) {// here Pz = P(z) is a polynomialconstdouble Pz = (((((a[6]*z + a[5])*z + a[4])*z + a[3])*z + a[2])*z + a[1])*z + a[0];// and Qz = Q(z) is a polynomialconstdouble Qz = ((((((b[7]*z + b[6])*z + b[5])*z + b[4])*z + b[3])*z + b[2])*z + b[1])*z + b[0];// use polynomials to calculate N(z) = \sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}Nz = RT2PI*NDash*Pz/Qz; }else{// implement recurrence relation on F_4(z)constdouble F4z = z + 1.0/(z + 2.0/(z + 3.0/(z + 4.0/(z + 13.0/20.0))));// use polynomials to calculate N(z), note here that Nz = N' / FNz = NDash/F4z; } }//returnx>=0.0 ? 1-Nz : Nz; } intmain() { cout.precision(16); cout << "N(0)=" <<normalDistribution(0.) << endl; cout << "N(1)=" <<normalDistribution(1.) << endl; cout << "N(2)=" <<normalDistribution(2.) << endl;return0; }

### References

West, Graeme. Better approximations to cumulative normal functions. Wilmott Magazine, 9:70–76, 2005. URL https://lyle.smu.edu/~aleskovs/emis/sqc2/accuratecumnorm.pdf.

can u do it with monte carlo methods

ReplyDeletenice job, thanks

ReplyDelete