8.3 Hypothesis Testing
Idea: have some suspicion as to value of (unknown) population
parameter; use a sample to test if hypothesis is true
Let p = the fraction of students at Allentown College who are registered Democratic.
I think fewer than 40% of students are registered Democratic (p < .40).
Setup: use two complementary hypotheses:
H0 = null hypothesis; what we suspect isn't true
H1 = alternate hypothesis; what we suspect is true
In example above, we'd use
null hypothesis H0 : p >= .40
alternate hypothesis H1 : p < .40
Approach: play devil's advocate:
assume the null hypothesis is true
choose a set of outcomes for the test statistic that will be used to
reject the null hypothesis; this set is called the critical region or
rejection region
take a sample, and compute a test statistic from the sample whose value
will (hopefully) refute the assumption that the null hypothesis is true
and allow us to reject it
Assume the null hypothesis, that p >= .40, and sample 20 students.
Let X = number in sample registered Democratic.
(The probability .0510 found below for the rejection region X <= 4 is
called the significance level α of the test.)
What outcomes would suggest that the assumption p >= .40 is false?
Well, if p >= .40, we expect 8 or more to be registered Democratic;
thus if our sample has fewer than 8 Democrats, the assumption that 40% or
more of the students are registered Democratic would seem incorrect.
However: even if 40% of the students are Democrats, we
certainly won't always get exactly 8 Democrats in every sample of 20 students;
we'd expect the number to vary somewhat from sample to sample. For example,
it wouldn't be all that unlikely that just due to random chance, we'd get
a sample of 20 students in which only 7 are registered Democratic; thus
this result wouldn't give strong evidence that there must be fewer
than 40% Democrats in the student population.
Thus we really only get strong evidence that we should reject the null
hypothesis if the number of Democrats in the sample is far less
than the expected number of 8.
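As a rough check on this reasoning, the cumulative probabilities can be computed directly instead of being read from a table. This is only a sketch: it assumes X is binomial with n = 20 and p = .40 (the boundary of the null hypothesis), and the helper name binom_cdf is ours, not something from these notes.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 20, 0.40  # sample size and the boundary value of the null hypothesis
print(f"expected number of Democrats: {n * p:.0f}")
for k in range(9):
    # Chance of seeing k or fewer Democrats purely from sampling variation
    print(f"P(X <= {k}) = {binom_cdf(k, n, p):.4f}")
```

Under these assumptions P(X <= 7) comes out near .42, so a sample with 7 or fewer Democrats is quite ordinary, while the probabilities for small k are far lower.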
We'll use as our rejection region X <= 4;
if p >= .40, it is unlikely we'd get a sample with 4 or fewer Democrats.
In fact, we can quantify just how unlikely this is using probabilities:
If the null hypothesis is true (p >= .40), what's the probability that
in a sample of size 20 we would have X <= 4?
Suppose p = .40; then the distribution of X will be the binomial distribution
with n = 20, p = .40
Then the probability that X <= 4 is .0510,
from the table of cumulative probabilities for the binomial distribution.
If p > .40, there's an even smaller chance that X <= 4.
Thus we conclude that if the null hypothesis is true, there would be
only about a 5% chance we'd get a sample with 4 or fewer Democrats in it due
just to sampling variation. While this could happen, it's quite
unlikely, and thus it seems more likely that the null hypothesis is false,
and that there are in fact fewer than 40% Democrats in the student population.
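The table lookup for the rejection region X <= 4 can be reproduced with a few lines of code. This is a minimal sketch, assuming X is binomial with n = 20; the values of p above .40 are illustrative choices of ours, included only to show that the chance of landing in the rejection region shrinks as p grows.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, cutoff = 20, 4  # sample size and the rejection region X <= 4
for p in (0.40, 0.45, 0.50, 0.60):  # 0.40 is the boundary; the rest are illustrative
    print(f"p = {p:.2f}: P(X <= {cutoff}) = {binom_cdf(cutoff, n, p):.4f}")
```

The first line of output should agree with the table value of .0510, and each later line should be smaller than the one before it.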
The significance level α is the probability that the test statistic will
fall into the rejection region when the null hypothesis is true (causing
us to erroneously reject the null hypothesis).
We usually choose a desired value for α, and then
find the rejection region corresponding to it.
If we want the significance level α of the
test of Democratic registration to be .01, what should the rejection region be?
Well, assuming the null hypothesis is true, that p = .40 or greater,
we can see from the table for the binomial distribution with n = 20 and
p = .40 that
P(X <= 3) = .0160 and P(X <= 2) = .0036.
Thus the rejection region X <= 3 doesn't quite
give us the significance level desired; there's a 1.6% chance we'll erroneously
reject the null hypothesis when it is in fact true. The rejection region
X <= 2 is stricter than required, since using it there would
only be about a 0.4% chance of erroneously rejecting the null hypothesis. We'd
have to decide which of the two regions to use (depending on whether a
1.6% chance of erroneously rejecting the null hypothesis is low enough,
or if we really want to have this probability below 1%).
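The search for a rejection region matching a target significance level can be sketched in code. The function rejection_cutoff below is a hypothetical helper of ours: it returns the largest c for which P(X <= c) does not exceed the target, computed at the boundary value of p. It takes the strict view that the error probability must stay at or below the target, which is only one of the two choices discussed above.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def rejection_cutoff(n, p0, alpha):
    """Largest c with P(X <= c) <= alpha under Binomial(n, p0); None if no such c."""
    cutoff = None
    for c in range(n + 1):
        if binom_cdf(c, n, p0) <= alpha:
            cutoff = c
        else:
            break  # the cumulative probability only grows with c
    return cutoff

n, p0 = 20, 0.40
for alpha in (0.01, 0.05, 0.10):  # illustrative target levels
    c = rejection_cutoff(n, p0, alpha)
    print(f"alpha = {alpha:.2f}: reject when X <= {c} "
          f"(actual level {binom_cdf(c, n, p0):.4f})")
```

For a target of .01 this picks X <= 2, the stricter of the two regions considered above. Because X takes only whole-number values, the achieved level rarely equals the target exactly.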
There are 4 possible outcomes of a hypothesis test:
Null hypothesis is true, and we erroneously reject it; called a Type I error
Null hypothesis is false, and we correctly reject it
Null hypothesis is true, and we correctly refuse to reject it
Null hypothesis is false, and we incorrectly refuse to reject it; called
a Type II error
Type I error:
For a hypothesis test, we specify the desired level of significance α,
and use this to determine the rejection region; this fixes the probability
of making a Type I error.
This is usually the worse error to make; thus we want to limit the
chance that we'll make it.
Surgical study: determine if a new surgical technique increases the
lifespan of patients suffering from a particular condition. In this case
the null and alternate hypotheses would be as follows:
H0 : surgery has no effect on longevity
H1 : surgery increases longevity
Because of the dangers associated with surgery, we would definitely not
want to erroneously conclude that the surgery is beneficial if it in fact
offers no real benefit to the patients. Thus we'd want to choose our rejection
region to be such that there would be only a small chance that sampling
variation would give us a sample showing improvement if there is in fact
no improvement in the population at large.
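Returning to the Democratic-registration test, both error probabilities can be computed for a given rejection region. The sketch below makes two assumptions. The Type I probability is evaluated at the boundary p = .40, where it is largest. The Type II probability needs a specific true value of p, and the .20 used here is a hypothetical choice of ours, not a figure from the notes. The function names are also ours.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def type_i_error(n, p0, cutoff):
    """Chance of rejecting H0 (X <= cutoff) when p equals the boundary value p0."""
    return binom_cdf(cutoff, n, p0)

def type_ii_error(n, p_true, cutoff):
    """Chance of failing to reject H0 (X > cutoff) when the true fraction is p_true."""
    return 1 - binom_cdf(cutoff, n, p_true)

n, p0, p_true = 20, 0.40, 0.20  # p_true is a hypothetical alternative
for cutoff in (2, 3, 4):
    print(f"reject when X <= {cutoff}: "
          f"Type I = {type_i_error(n, p0, cutoff):.4f}, "
          f"Type II = {type_ii_error(n, p_true, cutoff):.4f}")
```

The output shows the trade-off: shrinking the rejection region lowers the chance of a Type I error but raises the chance of a Type II error at the assumed alternative.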