Exercises - Hypothesis Testing Basics

  1. When trying to provide strong evidence that a new test has a better than $50\%$ chance of detecting early onset of Alzheimer's disease, which $p$-value does one hope to see: $0.999$, $0.5$, $0.01$, or $0.0001$? Why?

    $0.0001$. We want the $p$-value -- the probability of seeing what we saw in the sample (or something even more compelling) under the assumption of the null hypothesis (here, $p \le 0.5$) -- to be as small as possible, so that we can reject the null hypothesis in favor of the alternative (here, $p \gt 0.5$).

  2. One wishes to better support a claim that tablets of a particular cold medicine contain 325 mg of aspirin. If one increases the number of tablets sampled and measured for their aspirin content, will subsequent hypothesis testing better support this claim?

    No. The null hypothesis in the situation described is $\mu = 325$. Hypothesis testing ends either in rejecting the null hypothesis or in failing to reject it; it never results in supporting the null hypothesis. A larger sample makes a real departure from $325$ mg easier to detect, but failing to reject is still not evidence for the claim. In general, hypothesis testing can never be used to support a claim that a population parameter is equal to some particular value.

  3. It is hoped that a new computer vision system identifies the gender of its subjects correctly more often than not. In a sample of 50 subjects, 20 have their gender correctly identified. Does the sample data support what we wish to show? Would any $\widehat{p} \lt 0.5$ be able to support what we wish to show?

    No. The null hypothesis here is $p \le 0.5$, and the alternative is $p \gt 0.5$. The sample proportion is $\widehat{p} = 20/50 = 0.4$, and certainly no $\widehat{p} \lt 0.5$ is going to provide evidence that $p \gt 0.5$. That said, if the engineers who designed this vision system continue to get sample proportions below $0.5$, they might consider programming their system to identify as male any subjects formerly identified as female, and vice versa.

  4. If one sees 90 heads in 100 tosses, can one reasonably conclude that the coin is biased? Explain your answer in the context of hypothesis testing. (Use only qualitative estimates of the probabilities involved.)

    Yes. The null hypothesis is the accepted belief that coins should produce heads $50\%$ of the time (i.e., $H_0 : p = 0.5$). Under the assumption of this null hypothesis, the probability of seeing $90$ heads (or more) out of $100$ tosses is very small. (How would you find this probability?) It is therefore highly unlikely that we would see such a result the first time we flip a coin $100$ times -- and yet we did. As such, we reject the null hypothesis in favor of the alternative (i.e., the coin is biased, $p \ne 0.5$).
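    One way to answer the parenthetical question above is to add up the binomial probabilities directly. The following is a minimal sketch using only Python's standard library; the function name `binomial_upper_tail` is made up for illustration, and the code assumes a fair coin ($p = 0.5$), so that every sequence of tosses is equally likely.

```python
from math import comb

def binomial_upper_tail(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5).

    With p = 0.5 each of the 2**n toss sequences is equally likely, so the
    probability is (number of sequences with at least k heads) / 2**n.
    """
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n

# Probability of 90 or more heads in 100 tosses of a fair coin.
tail = binomial_upper_tail(100, 90)
print(f"P(X >= 90) = {tail:.3e}")
```

    Since the alternative here is two-sided ($p \ne 0.5$), one would typically double this tail probability to obtain the $p$-value; either way it is far below any commonly used significance level.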

  5. One wishes to show that the average pulse rate for people of a given age is less than $75$, and a simple random sample of people this same age has an average pulse rate of $74.4$. Does one have significant evidence in support of what one wished to show? Explain your answer in the context of hypothesis testing. (Use only qualitative estimates of the probabilities involved.)

    Probably not. The alternative hypothesis is what we wish to show: $\mu < 75$. This makes the null hypothesis $\mu \ge 75$. The probability that one would see a pulse rate at least as compelling as what was seen (i.e., anything 74.4 or less) is probably not small given how close $74.4$ and $75$ are. Thus, we would likely fail to reject the null hypothesis, and not have significant evidence of what we wished to show. Much of this analysis, however, depends on how spread out the distribution of pulse rates might be. With the problem not reporting either $\sigma$ or $s$, it is difficult to know for sure.
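    To see how much the conclusion depends on the spread, here is a rough sketch that computes a left-tailed $p$-value for two made-up scenarios. The sample size `n = 50` and both values of `sigma` are purely hypothetical (the exercise reports none of them), and the sketch assumes the sample mean is approximately normally distributed with a known $\sigma$.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_p_value(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """P(sample mean <= observed value) when the true mean is mu0 and sigma is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return NormalDist().cdf(z)

# Hypothetical values only: the exercise gives neither sigma nor n.
n = 50
p_wide = left_tailed_p_value(74.4, 75, sigma=12, n=n)    # pulse rates vary a lot
p_narrow = left_tailed_p_value(74.4, 75, sigma=1, n=n)   # pulse rates vary very little

print(f"sigma = 12: p-value = {p_wide:.4f}")
print(f"sigma = 1:  p-value = {p_narrow:.2e}")
```

    Under these assumed numbers, the widely spread population gives a large $p$-value (fail to reject), while the tightly clustered one gives a very small $p$-value, which is why the answer above can only be "probably not."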

  6. Express $H_0$ and $H_1$ symbolically for testing each claim below:

    1. The mean annual starting salary for computer science majors is greater than $\$70,000$.
    2. The standard deviation for human body temperatures equals $0.62^{\circ} F$.
    3. The proportion of people that suffer from diabetes in America is less than $9\%$.
    4. The standard deviation of duration times (in seconds) of the Old Faithful geyser is less than 40 seconds.
    1. $H_0 : \mu \le 70,000$ and $H_1 : \mu \gt 70,000$
    2. $H_0 : \sigma = 0.62$ and $H_1 : \sigma \ne 0.62$
    3. $H_0 : p \ge 0.09$ and $H_1 : p \lt 0.09$
    4. $H_0 : \sigma \ge 40$ and $H_1 : \sigma \lt 40$

  7. Assume the normal distribution applies and find the critical $z$ value for the situation described:

    1. Two-tailed test; $\alpha = 0.01$
    2. Right-tailed test; $\alpha = 0.02$
    3. $\alpha = 0.05$; $H_1 : \mu \ne 98.6^{\circ} F$
    4. $\alpha = 0.005$; $H_1 : \mu \lt 5280 \textrm{ ft}$
    1. $z = \pm 2.5758$
    2. $z = 2.05$
    3. $z = \pm 1.96$
    4. $z = -2.5758$
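    The critical values above can be checked with the inverse of the standard normal CDF. This is a small sketch using `statistics.NormalDist` from Python's standard library; the helper name `critical_z` and its `tail` argument are invented here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def critical_z(alpha: float, tail: str) -> float:
    """Critical z value for a 'left', 'right', or 'two' tailed test.

    For a two-tailed test the positive value is returned; the rejection
    region is bounded by plus and minus this value.
    """
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(round(critical_z(0.01, "two"), 4))    # part 1 (use as +/-)
print(round(critical_z(0.02, "right"), 2))  # part 2
print(round(critical_z(0.05, "two"), 2))    # part 3 (use as +/-)
print(round(critical_z(0.005, "left"), 4))  # part 4
```

    The direction of the inequality in $H_1$ determines the tail: $\ne$ gives a two-tailed test, $\lt$ a left-tailed test, and $\gt$ a right-tailed test.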

  8. The test statistic for hypothesis tests involving a single proportion is given by:

    $$z = \displaystyle{\frac{\widehat{p} - p}{\sqrt{\displaystyle{\frac{pq}{n}}}}}$$

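    Here $p$ is the proportion stated in the null hypothesis, $q = 1 - p$, $\widehat{p}$ is the sample proportion, and $n$ is the sample size. Find the value of the test statistic for the claim that the proportion of peas with yellow pods equals $0.25$, where the sample involved includes $580$ peas with $152$ of them having yellow pods.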

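    With $\widehat{p} = 152/580 \approx 0.2621$, $p = 0.25$, $q = 0.75$, and $n = 580$, we find $z \approx 0.67$.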
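    The arithmetic for this test statistic can be reproduced in a few lines. This is only a sketch of the formula above; the function name `one_proportion_z` and its parameter names are made up for illustration.

```python
from math import sqrt

def one_proportion_z(successes: int, n: int, p0: float) -> float:
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), where q0 = 1 - p0."""
    p_hat = successes / n   # sample proportion
    q0 = 1 - p0             # complement of the hypothesized proportion
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# 152 yellow-pod peas out of 580, testing the claim p = 0.25.
z = one_proportion_z(152, 580, 0.25)
print(f"z = {z:.2f}")
```

    Note that the standard error in the denominator uses the hypothesized proportion $p$, not the sample proportion $\widehat{p}$, because the test statistic is computed under the assumption that the null hypothesis is true.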

  9. For each situation below, find the $p$-value and, using a $0.05$ level of significance, state the conclusion (i.e., reject or fail to reject the null hypothesis):

    1. The test statistic in a left-tailed test is $z=-1.25$
    2. The test statistic in a two-tailed test is $z=1.75$
    3. With $H_1 : p \ne 0.707$, the test statistic is $z = -2.75$
    4. With $H_1 : p \gt 1/4$, the test statistic is $z = 2.30$
    1. $0.1056$; fail to reject the null hypothesis
    2. $0.0802$; fail to reject the null hypothesis
    3. $0.0060$; reject the null hypothesis
    4. $0.0107$; reject the null hypothesis
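    These $p$-values come from areas under the standard normal curve, which can be computed as sketched below. The helper `p_value` and its `tail` labels are hypothetical names chosen for this example; the only library call used is `NormalDist().cdf` from Python's standard library.

```python
from statistics import NormalDist

def p_value(z: float, tail: str) -> float:
    """p-value for a z test statistic in a 'left', 'right', or 'two' tailed test."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

alpha = 0.05
for z, tail in [(-1.25, "left"), (1.75, "two"), (-2.75, "two"), (2.30, "right")]:
    p = p_value(z, tail)
    decision = "reject" if p < alpha else "fail to reject"
    print(f"z = {z:5.2f}, {tail:>5}-tailed: p = {p:.4f}; {decision} the null hypothesis")
```

    Software values can differ from table-based answers in the last digit. For the two-tailed case with $z = 1.75$, for example, the computed value is about $0.0801$, whereas doubling the rounded table entry $0.0401$ gives $0.0802$; the conclusion is the same.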

  10. State the inference for each situation:

    1. In testing the claim that the proportion of blue M & M's is greater than $5\%$, the null hypothesis is rejected.
    2. In testing the claim that the mean length of a pregnancy for American women taking a particular drug is no longer its expected length of $268$ days, we fail to reject the null hypothesis.
    1. There is statistically significant evidence that the proportion of blue M & M's is greater than $5\%$.
    2. There is no statistically significant evidence that the mean length of a pregnancy for American women taking this drug is no longer its expected length of $268$ days.