Probability Distributions

I'm writing this for those who would like to learn a little more about the Gaussian distribution. We'll learn through Matlab experiments. To begin, we'll generate some Gaussian random variables. For contrast, we'll also generate uniform random variables.
>> n = 100000;
>> u = rand(1,n);
>> g = randn(1,n);
To make these collections of variables easier to deal with, and to prepare for plotting their cumulative distribution functions later, let's sort them:
>> u = sort(u);
>> g = sort(g);
The probability density functions tell us what the histogram of these distributions will look like. Recall that the density of the Gaussian is
exp(-x^2 / (2*sigma^2)) / (sigma * sqrt(2*pi))
For now, we'll take sigma = 1. Also note that the density function of a variable uniformly distributed between 0 and 1 is just the constant function 1. Now, to generate the histograms, we'll divide the samples into buckets of size .1, and plot both the histograms and the density functions.
>> a = ceil(max(abs(g))) + .05;
>> x = [-a:.1:a];
>> figure(1);
>> clf
>> hist(g,x)
>> hold on;
The probability that a Gaussian random variable will fall in a bucket of size .1 around x is approximately .1*p(x). As there are n samples, we expect about .1*n*p(x) of them to fall in that bucket:
>> y = (.1 * n) * exp(-x.^2/2)/sqrt(2*pi);
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)
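If you prefer, you can instead normalize the histogram counts by .1*n so that they estimate the density itself; the normalized counts should then lie close to the curve exp(-x.^2/2)/sqrt(2*pi). Here is a minimal sketch of that check (the variable names c and dens, and the use of figure 3, are just for illustration):
>> c = hist(g,x);                % counts in each bucket
>> dens = c / (.1 * n);          % normalized counts: an estimate of the density
>> figure(3); clf
>> plot(x, dens, 'b', x, exp(-x.^2/2)/sqrt(2*pi), 'r')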
It's not quite as exciting for the uniform variables:
>> figure(2);
>> clf
>> x = [.05:.1:.95];
>> hist(u,x)
>> hold on
>> y = .1 * n * ones(size(x));
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)
The other plot worth seeing is the cumulative distribution function of each of these distributions. Since the samples are already sorted, we can plot the empirical cumulative distribution functions directly:
>> figure(1);
>> clf
>> x = [1:length(g)]/length(g);
>> plot(g,x)
>> xlabel('value')
>> ylabel('probability')

>> figure(2)
>> clf
>> x = [1:length(u)]/length(u);
>> plot(u,x)
>> xlabel('value')
>> ylabel('probability')
When we see a point (v,p) on one of these curves, it means that the chance a variable drawn from the distribution is less than v is p. Looking at figure 1, we see that the curve has a point at (0,.5). This means that the chance that the Gaussian is less than 0 is .5.
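We can check this claim directly from the samples, and, if you like, overlay the exact Gaussian cumulative distribution function, which can be written in terms of the error function erf. A minimal sketch (the range [-4,4] for t is arbitrary):
>> sum(g < 0)/length(g)                    % should be close to .5
>> figure(1); hold on;
>> t = [-4:.01:4];
>> plot(t, .5*(1 + erf(t/sqrt(2))), 'r')   % exact Gaussian cumulative distribution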

Gaussian Channel

To learn a little about what happens if we transmit a 1 over a Gaussian channel, let's do it many times.
>> n = 100000;
>> r = 1 + randn(1,n);
To see what fraction of the received values were negative, we compute
>> sum(r < 0)/n

ans =

      0.15737
So, if we guessed the value of a bit by just checking whether the received value was positive or negative, we'd get the answer wrong about 15.7% of the time.
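This fraction agrees with the prediction from the density: the received value is negative exactly when the noise is below -1, and that probability can be computed with the complementary error function erfc. A quick sketch of the computation, which comes out to about 0.1587:
>> .5 * erfc(1/sqrt(2))   % probability that a standard Gaussian is less than -1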
Let's now repeat the experiment with more noise, say at sigma = 2:
>> sigma = 2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n

ans =

       0.3081
And, with less noise, at sigma = 1/2:
>> sigma = 1/2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n

ans =

      0.02377
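The same formula predicts the error probability for any noise level sigma: it is .5*erfc(1/(sigma*sqrt(2))), which is about 0.308 at sigma = 2 and about 0.023 at sigma = 1/2, matching the experiments above. A minimal sketch of the computation (the variable names sigmas and perr are just for illustration):
>> sigmas = [.5 1 2];
>> perr = .5 * erfc(1./(sigmas*sqrt(2)));   % predicted error probabilities
>> [sigmas; perr]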

Dan Spielman
Last modified: Tue Sep 17 13:21:13 EDT 2002