# Probability Distributions

I'm writing this for those who would like to learn a little
more about the Gaussian distribution.
We'll learn through MATLAB experiments.
To begin, we'll generate some Gaussian random variables.
For contrast, we'll also generate uniform random variables.
>> n = 100000;
>> u = rand(1,n);
>> g = randn(1,n);

To make these collections of variables easier to deal with,
let's sort them:
>> u = sort(u);
>> g = sort(g);

The probability density functions tell us what the histogram
of these distributions will look like.
Recall that the density of the Gaussian with standard deviation sigma is
exp(-x^2 / (2*sigma^2)) / (sigma * sqrt(2 * pi))
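
As a quick sanity check, here is a minimal sketch that integrates the Gaussian density numerically for a few values of sigma. The names `gaussPdf`, `xs`, and `areas` are ours, and the density is written with the standard normalization 1/(sigma*sqrt(2*pi)); each area should come out very close to 1.

```matlab
% Gaussian density with standard deviation sigma (illustrative helper).
gaussPdf = @(x, sigma) exp(-x.^2 ./ (2*sigma.^2)) ./ (sigma * sqrt(2*pi));

xs = -10:.001:10;            % wide grid: at least 5 sigma for every sigma below
sigmas = [0.5 1 2];
areas = zeros(size(sigmas));
for k = 1:length(sigmas)
    % The trapezoid rule approximates the integral of the density over xs.
    areas(k) = trapz(xs, gaussPdf(xs, sigmas(k)));
end
fprintf('sigma = %.1f: area = %.6f\n', [sigmas; areas]);
```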

For now, we'll take *sigma = 1*.
Also note that the density function of a variable
uniformly distributed between 0 and 1 is just
the constant function 1.
Now, to generate the histograms, we'll divide the samples
into buckets of size .1, and plot both the histograms
and the density functions.
>> a = ceil(max(abs(g))) + .05;
>> x = [-a:.1:a];
>> figure(1);
>> clf
>> hist(g,x)
>> hold on;

The probability that a Gaussian random variable will fall in
a bucket of size .1 around x is approximately .1*p(x),
where p is the density function.
As there are n items, we guess that the number that should fall
in that bucket is given by the formula:
>> y = (.1 * n) * exp(-x.^2/2)/sqrt(2*pi);
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)

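To see how good this guess is without reading it off the plot, we can compare the observed bucket counts with the predicted ones directly. This is only a sketch: it draws fresh samples, and the names `counts` and `i0` are ours.

```matlab
n = 100000;
g = randn(1, n);                           % Gaussian samples with sigma = 1
a = ceil(max(abs(g))) + .05;
x = -a:.1:a;                               % bucket centers, spaced .1 apart
counts = hist(g, x);                       % observed number of samples per bucket
y = (.1 * n) * exp(-x.^2/2) / sqrt(2*pi);  % predicted number per bucket

[~, i0] = min(abs(x));                     % index of a bucket nearest to 0
fprintf('bucket at x = %.2f: observed %d, predicted %.1f\n', ...
        x(i0), counts(i0), y(i0));
```

The two numbers should differ by something on the order of the square root of the predicted count, which is the size of the fluctuation we expect from random sampling.
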
It's not quite as exciting for the uniform variables:
>> figure(2);
>> clf
>> x = [.05:.1:.95];
>> hist(u,x)
>> hold on
>> y = .1 * n * ones(size(x));
>> p = plot(x,y,'r');
>> set(p,'LineWidth',3)

The other plot worth seeing is that of the cumulative distribution
functions of these distributions.
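Because the samples are sorted, plotting each sample against its
rank divided by the number of samples traces out the (empirical)
cumulative distribution function: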
>> figure(1);
>> clf
>> x = [1:length(g)]/length(g);
>> plot(g,x)
>> xlabel('value')
>> ylabel('probability')
>> figure(2)
>> clf
>> x = [1:length(u)]/length(u);
>> plot(u,x)
>> xlabel('value')
>> ylabel('probability')

When we see a point *(v,p)* on one of these curves,
it means that the chance a variable drawn from the distribution
is less than *v* is *p*.
Looking at figure 1, we see that the curve has a point
at (0,.5). This means that the chance that the Gaussian
is less than 0 is .5.
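
The same reading can be done numerically. The sketch below counts the fraction of samples below a value *v* and compares it with the exact Gaussian cumulative distribution, written using MATLAB's `erfc`. The names `v`, `pEmp`, and `pTheory` are ours, and the samples are drawn afresh.

```matlab
n = 100000;
g = sort(randn(1, n));                % sorted Gaussian samples, sigma = 1

v = 0;                                % the value we read off the curve
pEmp = sum(g < v) / n;                % fraction of samples less than v
pTheory = 0.5 * erfc(-v / sqrt(2));   % exact CDF of the standard Gaussian at v
fprintf('v = %g: empirical %.4f, exact %.4f\n', v, pEmp, pTheory);
```

Changing `v` to other values gives other points on the curve in figure 1.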

## Gaussian Channel

To learn a little about what happens if we transmit a
1 over a Gaussian channel, let's do it many times.
>> n = 100000;
>> r = 1 + randn(1,n);

To see how many times the result was negative,
we compute
>> sum(r < 0)/n
ans =
0.15737

So, if we guessed the value of a bit by just checking if the
received value was positive or negative, we'd get the
answer wrong 15.7% of the time.

Let's now repeat the experiment with more noise, say
at *sigma = 2*:
>> sigma = 2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n
ans =
0.3081

And with less noise (at *sigma = 1/2*):
>> sigma = 1/2;
>> r = 1 + sigma*randn(1,n);
>> sum(r < 0)/n
ans =
0.02377
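
These simulated error rates can be checked against the exact values. The received value is 1 + sigma*Z with Z a standard Gaussian, so it is negative exactly when Z < -1/sigma, and that probability can be written with `erfc`. The following is a small sketch; the names `sigmas` and `pErr` are ours.

```matlab
sigmas = [1 2 1/2];                           % noise levels used in the experiments
% P(1 + sigma*Z < 0) = P(Z < -1/sigma) = 0.5 * erfc(1 / (sigma*sqrt(2)))
pErr = 0.5 * erfc(1 ./ (sigmas * sqrt(2)));
fprintf('sigma = %.1f: P(error) = %.4f\n', [sigmas; pErr]);
```

The exact values are roughly 0.159, 0.309, and 0.023, which agree with the simulations to within sampling error.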

Dan Spielman
Last modified: Tue Sep 17 13:21:13 EDT 2002