Probability Distributions

I'm writing this for those who would like to learn a little more about the Gaussian distribution. We'll learn through Matlab experiments. To begin, we'll generate some Gaussian random variables. For contrast, we'll also generate uniform random variables.
>> n = 100000;
>> u = rand(1,n);
>> g = randn(1,n);
To make these collections of variables easier to deal with, and to prepare for plotting their cumulative distribution functions later, let's sort them:
>> u = sort(u);
>> g = sort(g);
The probability density functions tell us what the histogram of these distributions will look like. Recall that the density of the Gaussian is
exp(-x^2 / (2*sigma^2)) / (sigma * sqrt(2*pi))
For now, we'll take sigma = 1. Also note that the density function of a variable uniformly distributed between 0 and 1 is just the constant function 1. Now, to generate the histograms, we'll divide the samples into buckets of size .1, and plot both the histograms and the density functions.
>> a = ceil(max(abs(g))) + .05;
>> x = [-a:.1:a];
>> figure(1);
>> clf
>> hist(g,x)
>> hold on;
The probability that a Gaussian random variable will fall in a bucket of size .1 around x is approximately .1*p(x). As there are n samples, we expect about .1*n*p(x) of them to fall in that bucket:
>> y = (.1 * n) * exp(-x.^2/2)/sqrt(2*pi);
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)
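If you prefer, you can instead normalize the histogram counts by .1*n so that they estimate the density itself; the normalized counts should then lie close to the curve exp(-x.^2/2)/sqrt(2*pi). Here is a minimal sketch of that check (the variable names c and dens, and the use of figure 3, are just for illustration):
>> c = hist(g,x);                % counts in each bucket
>> dens = c / (.1 * n);          % normalized counts: an estimate of the density
>> figure(3); clf
>> plot(x, dens, 'b', x, exp(-x.^2/2)/sqrt(2*pi), 'r')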
It's not quite as exciting for the uniform variables:
>> figure(2);
>> clf
>> x = [.05:.1:.95];
>> hist(u,x)
>> hold on
>> y = .1 * n * ones(size(x));
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)
The other plot worth seeing is the cumulative distribution function of each of these distributions. Since the samples are already sorted, we can plot the empirical cumulative distribution functions directly:
>> figure(1);
>> clf
>> x = [1:length(g)]/length(g);
>> plot(g,x)
>> xlabel('value')
>> ylabel('probability')

>> figure(2)
>> clf
>> x = [1:length(u)]/length(u);
>> plot(u,x)
>> xlabel('value')
>> ylabel('probability')
When we see a point (v,p) on one of these curves, it means that the chance a variable drawn from the distribution is less than v is p. Looking at figure 1, we see that the curve has a point at (0,.5). This means that the chance that the Gaussian is less than 0 is .5.
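We can check this claim directly from the samples, and, if you like, overlay the exact Gaussian cumulative distribution function, which can be written in terms of the error function erf. A minimal sketch (the range [-4,4] for t is arbitrary):
>> sum(g < 0)/length(g)                    % should be close to .5
>> figure(1); hold on;
>> t = [-4:.01:4];
>> plot(t, .5*(1 + erf(t/sqrt(2))), 'r')   % exact Gaussian cumulative distribution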

Gaussian Channel

To learn a little about what happens if we transmit a 1 over a Gaussian channel, let's do it many times.
>> n = 100000;
>> r = 1 + randn(1,n);
To see what fraction of the received values were negative, we compute
>> sum(r < 0)/n

ans =

      0.15737
So, if we guessed the value of a bit by just checking whether the received value was positive or negative, we'd get the answer wrong about 15.7% of the time.
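This fraction agrees with the prediction from the density: the received value is negative exactly when the noise is below -1, and that probability can be computed with the complementary error function erfc. A quick sketch of the computation, which comes out to about 0.1587:
>> .5 * erfc(1/sqrt(2))   % probability that a standard Gaussian is less than -1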
Let's now repeat the experiment with more noise, say at sigma = 2:
>> sigma = 2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n

ans =

       0.3081
And, with less noise, at sigma = 1/2:
>> sigma = 1/2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n

ans =

      0.02377
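The same formula predicts the error probability for any noise level sigma: it is .5*erfc(1/(sigma*sqrt(2))), which is about 0.308 at sigma = 2 and about 0.023 at sigma = 1/2, matching the experiments above. A minimal sketch of the computation (the variable names sigmas and perr are just for illustration):
>> sigmas = [.5 1 2];
>> perr = .5 * erfc(1./(sigmas*sqrt(2)));   % predicted error probabilities
>> [sigmas; perr]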

Dan Spielman
Last modified: Tue Sep 17 13:21:13 EDT 2002