4.6 Normal Approximation to the Binomial Distribution
Recall the binomial distribution: the distribution of a random variable X which
counts the # of successes in n independent trials with probability of success p
on each trial. (“coin flips”)
X is a discrete random variable
density function: f(x) = C(n, x) p^x (1 - p)^(n - x), x = 0, 1, ..., n,
where C(n, x) = n!/(x!(n - x)!)
mean: μ = np
variance: σ^2 = var(X) = np(1 - p)
Recall that in the discrete case, density function gives probability
that particular outcomes will occur:
f(x) = P(X = x). Can present density function as a table
ex: binomial distribution with n = 5, p = .30;
then density function f(x) given in table below (values rounded to 3 places):
x       0      1      2      3      4      5
f(x)   .168   .360   .309   .132   .028   .002
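The table entries for this example can be recomputed directly from the binomial density formula. The following is a minimal Python sketch; the function name `binomial_pmf` is an illustrative choice, not something from the notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.30
# Density function as a table: one entry per possible outcome 0..n.
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, fx in table.items():
    print(f"f({x}) = {fx:.3f}")
```

The entries sum to 1, as any density function table must.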
Can represent the table as a histogram (bar graph):
(It is customary to center the bars over the values they represent.)
Theorem: Let X be binomial with parameters n & p.
Then for n large, X is approximately normally distributed with mean
μ = np, variance σ^2 = np(1 - p).
In the histogram, probability that particular value x will occur = height of bar
over this value:
P(X = x) = height of bar.
with widths of bars equal to 1, area of each bar = height * width = height;
P(X = x) = area of bar.
Thus can use areas to find probabilities, as with continuous random variables;
P(X = x) = P(x - .5 <= X <=
x + .5) = area of bar between x - .5 and x + .5
In example above, to compute P(2 <= X <= 3), could approach as follows:
P(2 <= X <= 3) = P(1.5 <= X <= 3.5)
= sum of areas of bars lying between x = 1.5 and x = 3.5
= .309 + .132 = .441
Of course, this gives the same result we'd get just by using the density function table.
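The bar-area view of this calculation can be sketched in Python as follows. The helper `bar_area_between` is a hypothetical name introduced here for illustration; it adds up height * width for every width-1 bar that lies inside the given interval.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Height of the histogram bar centered at x."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def bar_area_between(lo, hi, n, p):
    """Total area of the bars [x - .5, x + .5] lying within [lo, hi]."""
    width = 1.0
    total = 0.0
    for x in range(n + 1):
        if lo <= x - 0.5 and x + 0.5 <= hi:
            total += binomial_pmf(x, n, p) * width
    return total

# P(2 <= X <= 3) for n = 5, p = .30, as the area between 1.5 and 3.5
prob = bar_area_between(1.5, 3.5, n=5, p=0.30)
print(f"{prob:.3f}")
```

Only the bars at x = 2 and x = 3 fall between 1.5 and 3.5, so the sum reduces to f(2) + f(3).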
When n is large, tops of rectangles seem to form a smooth curve; if we
knew what this was, we could use it to find areas & hence probabilities
with integrals (instead of summing areas of bars).
i.e., the tops of rectangles in histogram form approximately a normal curve
w/same mean, variance
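To see how closely the tops of the bars follow the normal curve, one can compare each bar height with the normal density at the bar's center. The sketch below uses n = 50, p = .30, which are illustrative values chosen here and not taken from the notes; it relies on `statistics.NormalDist` from the Python standard library.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Height of the histogram bar centered at x."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 50, 0.30                      # illustrative values (assumption)
mu = n * p                           # same mean as the binomial
sigma = sqrt(n * p * (1 - p))        # same variance as the binomial
curve = NormalDist(mu, sigma)

# Gap between each bar height and the curve's height at the bar's center.
gaps = [abs(binomial_pmf(x, n, p) - curve.pdf(x)) for x in range(n + 1)]
max_gap = max(gaps)
print(f"largest gap between bar height and normal curve: {max_gap:.4f}")
```

Rerunning with a larger n should shrink the largest gap, in line with the theorem.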
How large must n be for the approximation to be good? Rule of thumb: the
approximation is good if np(1 - p) > 5.
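The check np(1 - p) > 5 is easy to automate. A small sketch, where the function name `normal_approx_ok` is a hypothetical choice and the (n, p) pairs are the ones used in this section's examples:

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb from the notes: approximation is good if np(1 - p) > 5."""
    return n * p * (1 - p) > threshold

for n, p in [(5, 0.30), (80, 0.35), (200, 0.5)]:
    variance = n * p * (1 - p)
    print(f"n={n}, p={p}: np(1-p) = {variance:.2f}, ok = {normal_approx_ok(n, p)}")
```

The n = 5 table example fails the check, which is why it is handled with the density table rather than a normal curve.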
To use this result, compute probabilities for binomial random variables
by finding the area under the appropriate normal curve.
ex: In an experiment 80 trees are grown under stressful conditions. Suppose
the probability of any one tree surviving is .35; what’s the probability
that between 15 and 25 trees survive out of the 80?
Let X = # which survive; then X is binomial, w/ n = 80, p = .35;
mean μ = np = 80(.35) = 28
variance σ^2 = np(1 - p) = 80(.35)(.65) = 18.2
standard deviation σ = 4.3
Want P(15 <= X <= 25).
Histogram for X: [figure omitted]
Approach: Use a normal distribution to approximate the probability.
Tables don’t go up to n = 80; could compute as P(15 <= X <= 25) =
f(15) + f(16) + ... + f(25), using the formula for the density
function, but this is time-consuming!
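Let Y be a normal random variable, with mean μ = 28, s.d. σ = 4.3. Then X and Y have
approximately the same distribution, in the sense that if we drew the histogram
corresponding to X the tops of the bars would be very closely approximated
by the density function for Y.
The bars for x = 15, ..., 25 cover the interval from 14.5 to 25.5, so
P(15 <= X <= 25) ≈ P(14.5 <= Y <= 25.5).
With the standard normal r.v. Z = (Y - 28)/4.3, this is
P(-3.14 <= Z <= -.58) = .2810 - .0008 = .2802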
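For the tree example, a short Python sketch can do both the "time-consuming" exact sum and the normal approximation. It assumes the half-unit adjustment from the bar-area argument earlier in this section (bars for 15 through 25 span 14.5 to 25.5) and uses the unrounded standard deviation rather than 4.3, so the result may differ slightly from a table-based calculation.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 80, 0.35
mu = n * p                        # 28
sigma = sqrt(n * p * (1 - p))     # about 4.27; the notes round to 4.3

# Exact: f(15) + f(16) + ... + f(25)
exact = sum(binomial_pmf(x, n, p) for x in range(15, 26))

# Normal approximation: area under Y's density between 14.5 and 25.5
Y = NormalDist(mu, sigma)
approx = Y.cdf(25.5) - Y.cdf(14.5)

print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```

Both values should come out near .28, which suggests the approximation loses little here.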
ex: Flip a coin 200 times; what’s the probability of getting more than 120 heads?
Let X = # heads that occur in 200 flips; then X is binomial, with
n = 200 and p = .5,
mean = np = 200(.5) = 100, variance = np(1 - p) =
200(.5)(.5) = 50,
standard deviation = 7.1.
Want: P(X > 120)
Calculate using the normal approximation: let Y be normal, with
mean 100 and s.d. 7.1;
then P(X > 120) = P(X >= 121) ≈
P(Y > 120.5)
Use standard normal r.v. Z to compute probability for normal Y:
Z = (Y - 100)/7.1;
when Y = 120.5, Z = (120.5 - 100)/7.1 = 2.89,
so P(Y > 120.5) = P(Z > 2.89) =
1 - P(Z <= 2.89) = 1 - .9981
Thus P(X > 120) ≈
.0019, i.e., there's only about a .2% chance that we'll get more than 120
heads in 200 flips of a (fair) coin.
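The coin-flip calculation can be checked the same way. This sketch compares the exact binomial tail with the normal approximation P(Y > 120.5); it uses the unrounded standard deviation sqrt(50) instead of 7.1, so the approximate value may differ from the table-based one in the last digit.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 200, 0.5
mu = n * p                        # 100
sigma = sqrt(n * p * (1 - p))     # about 7.07; the notes round to 7.1

# Exact: P(X > 120) = f(121) + f(122) + ... + f(200)
exact = sum(binomial_pmf(x, n, p) for x in range(121, n + 1))

# Normal approximation: P(Y > 120.5) = 1 - P(Y <= 120.5)
approx = 1 - NormalDist(mu, sigma).cdf(120.5)

print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```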