Exercises - Hypothesis Tests for Proportions (Two Samples)

  1. In testing the null hypothesis $p_1=p_2$, the confidence interval is $-.12 \lt p_1-p_2 \lt -.04$. Should we reject the null hypothesis? Explain.

    The null hypothesis can be written $p_1-p_2=0$. Reject the null hypothesis because 0 is not in the confidence interval for $p_1-p_2$.

  2. In testing the null hypothesis $p_1=p_2$, the confidence interval is $-.03 \lt p_1-p_2 \lt .07$. Should we reject the null hypothesis? Explain.

    Fail to reject the null hypothesis because 0 is in the confidence interval.

  3. A confidence interval for a difference in proportions is $-.02 < p_M-p_W<.08$. The sample proportion for men is $\widehat{p}_M = .34$. Find the maximum error of the estimate ($E$) and the sample proportion for women ($\widehat{p}_W$).

    $E=.05$. The point estimate $\widehat{p}_M-\widehat{p}_W=.03$, so $\widehat{p}_W=\widehat{p}_M-.03=.31$.
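
    The arithmetic can be checked with a short R sketch. The variable names are illustrative and not part of the exercise; the only facts used are that $E$ is half the interval width and the point estimate is the midpoint.

    ```r
    # Endpoints of the given interval for p_M - p_W
    lower <- -0.02
    upper <- 0.08

    E <- (upper - lower) / 2          # margin of error = half the interval width
    diff_hat <- (upper + lower) / 2   # point estimate = midpoint of the interval
    p_hat_M <- 0.34
    p_hat_W <- p_hat_M - diff_hat     # solve p_hat_M - p_hat_W = diff_hat

    c(E = E, diff_hat = diff_hat, p_hat_W = p_hat_W)
    ```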

  4. A researcher wants to examine the possibility that women are more likely to commit suicide off a cliff than men are. Data from the past 5 years show that out of 987 men who committed suicide, 394 jumped off a cliff. On the other hand, $44\%$ of the 500 women who committed suicide did so by jumping off a cliff.

    What can be said about the researcher's claim? Use the $P$-value method with a significance level of $\alpha=0.05$.

    $n_m= 987 \qquad\widehat{p}_m=394/987=.3992 \qquad \widehat{q}_m=.6008$

    $n_w= 500 \qquad \widehat{p}_w=220/500=.44 \qquad\widehat{q}_w=.56$

    $$n_m\widehat{p}_m= 394\geq 5\qquad n_m\widehat{q}_m= 593\geq 5 \qquad n_w\widehat{p}_w=220\geq5 \qquad n_w\widehat{q}_w=280\geq 5$$

    $H_0:p_m=p_w$; $H_1: p_m \lt p_w$

    $\bar p = \displaystyle{394+220 \over987+500}=.413$

    $ z=\displaystyle {(.3992 - .44) - 0 \over \sqrt{\displaystyle .413 \cdot .587\left({1 \over 987} + {1 \over 500}\right)}}\approx -1.51$

    $P$-value $= 0.0655$

    Fail to reject $H_0$ because the $P$-value $>\alpha$.

    There is not enough evidence to support the claim that women are more likely to commit suicide off a cliff than men are.
    R:
    prop.test(c(394,220),n=c(987,500),
                         alternative="less",
                         conf.level=0.95,
                         correct=FALSE)$p.value
    [1] 0.06552033
    

  5. A survey asks a random sample of men and women whether they agree with a particular statement. The result of the survey is that $35\%$ of the men and $43\%$ of the women agree with the statement. The maximum error of the estimate for the difference in proportions is calculated to be 0.135.

    1. Find the confidence interval for the difference in the proportions of men and women who agree.
    2. Should we reject the null hypothesis $p_M=p_W$? Explain.
    1. $-.215 \lt p_M-p_W \lt .055$
    2. Fail to reject the null hypothesis because 0 is in the confidence interval.
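
    As a rough check, the interval is the point estimate plus or minus the stated margin of error. The sketch below uses illustrative variable names and assumes nothing beyond the numbers given in the exercise.

    ```r
    p_hat_M <- 0.35
    p_hat_W <- 0.43
    E <- 0.135                        # margin of error given in the exercise

    diff_hat <- p_hat_M - p_hat_W     # point estimate of p_M - p_W
    lower <- diff_hat - E
    upper <- diff_hat + E
    c(lower = lower, upper = upper)

    # TRUE means 0 lies inside the interval, so H0 is not rejected
    contains_zero <- lower < 0 && 0 < upper
    contains_zero
    ```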

  6. Two dice are rolled 100 times each. Die A lands on six 20 times and Die B on six 10 times. Test the claim that Die A lands on six more often than Die B. Use the critical value and the $P$-value method with $\alpha=.01$.

    Assumptions: $20,\ 80,\ 10,\ 90\geq 5$

    $H_0: p_A=p_B \qquad H_1: p_A>p_B$

    Test statistic: $z=1.98$
    Critical value: $z=2.33$

    $P$-value $=.0239$

    Conclusion: Fail to reject $H_0$, because the test statistic is not in the rejection region and the $P$-value $>\alpha$.

    Inference: There is not enough evidence to support the claim that Die A lands on six more often than Die B.
    R:
    prop.test(c(20,10),n=c(100,100),
                       alternative="greater",
                       conf.level=0.99,
                       correct=FALSE)$p.value
    [1] 0.02383519
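
    For reference, the test statistic can be reconstructed by hand from the pooled proportion, following the same formula used in Exercise 4:

    $\bar p = \displaystyle{20+10 \over 100+100}=.15$

    $ z=\displaystyle {(.20 - .10) - 0 \over \sqrt{\displaystyle .15 \cdot .85\left({1 \over 100} + {1 \over 100}\right)}}\approx 1.98$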
    

  7. A survey of 430 randomly chosen adults found that $21\%$ of the 222 men and $18\%$ of the 208 women had purchased books online. Is there evidence that men are more likely than women to make online purchases of books? Use the $P$-value method with significance level .05.

    Assumptions: $46.6,\ 175.4,\ 37.4,\ 170.6 \geq 5$

    $H_0: p_M=p_W \qquad H_1: p_M \gt p_W$

    Test statistic: $z=.784$
    $P$-value $=.2177$ (from the table with $z$ rounded to $.78$; software using the unrounded $z$ gives about $.2165$)

    Conclusion: Fail to reject $H_0$ because the $P$-value $>\alpha$.

    Inference: There is not enough evidence to support the claim that men are more likely than women to make online purchases of books.
    R:
    prop.test(c(0.21*222,0.18*208),n=c(222,208),
                                   alternative="greater",
                                   conf.level=0.95,
                                   correct=FALSE)$p.value
    [1] 0.2165452
    

  8. A study investigated survival rates for in-hospital patients who suffered cardiac arrest. Among 58,593 patients who had cardiac arrest during the day, 11,604 survived and were discharged. Among 28,593 patients who suffered cardiac arrest at night, 4139 survived and were discharged. Use a 0.01 significance level to test the claim that the survival rates are the same for day and night. Use the confidence interval method.

    Assumptions: $11604,\ 46989,\ 4139,\ 24454 \geq 5$

    $H_0: p_D=p_N \qquad H_1: p_D\not=p_N$

    $E=.0068$; Confidence interval: $.046 \lt p_D-p_N \lt .060$

    Conclusion: Reject $H_0$ because 0 is not in the confidence interval.

    Inference: There is enough evidence to reject the claim that the survival rates are the same for day and night.
    R:
    prop.test(c(11604,4139),n=c(58593,28593),
                       alternative="two.sided",
                       conf.level=0.99,
                       correct=FALSE)$conf.int
    [1] 0.04645379 0.06012306
    attr(,"conf.level")
    [1] 0.99
    

  9. This crazy election season, it seems that many people are not planning on voting along party lines. A survey of eligible voters is taken with the following results: 24 out of 80 women and 11 out of 65 men say they will not vote for their party's candidate. Test the claim that this proportion is different for women and men. Use the critical value and the $P$-value method with significance level 0.10.

    Assumptions: $24,\ 56,\ 11,\ 54 \geq 5$

    $H_0: p_W=p_M \qquad H_1: p_W\not= p_M$

    Test statistic: $z=1.83$ (computed from $\widehat{p}_W-\widehat{p}_M$)
    Critical values: $z=\pm 1.645$

    $P$-value $=.0672$

    Conclusion: Reject the null hypothesis because the test statistic is in the rejection region and the $P$-value $<\alpha$.

    Inference: There is enough evidence to support the claim that the proportion is different for men and women.
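
    A minimal R sketch of the hand calculation, assuming the difference is taken as women minus men and using the pooled proportion for the standard error. The variable names are illustrative.

    ```r
    # Counts of voters who say they will not vote for their party's candidate
    x <- c(women = 24, men = 11)
    n <- c(women = 80, men = 65)

    p_hat <- x / n
    p_bar <- sum(x) / sum(n)                             # pooled proportion
    se <- sqrt(p_bar * (1 - p_bar) * sum(1 / n))         # pooled standard error
    z <- unname((p_hat["women"] - p_hat["men"]) / se)    # test statistic
    p_value <- 2 * pnorm(-abs(z))                        # two-tailed P-value
    z_crit <- qnorm(1 - 0.10 / 2)                        # critical value for alpha = 0.10

    round(c(z = z, z_crit = z_crit, p_value = p_value), 4)
    ```

    The rounded values should agree with $z=1.83$, the critical value $1.645$, and a $P$-value near $.067$.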

  10. A school nurse wants to determine whether there is a difference between boys and girls in the proportion who choose a healthy snack after school. The nurse conducts a random survey of students, and 67 of 150 girls chose a healthy snack, and 58 of 150 boys chose a healthy snack. Test at significance level 0.10 the claim that there is no difference in proportion between boys and girls. Use the confidence interval method.

    Assumptions: $67,\ 83,\ 58,\ 92 \geq 5$

    $H_0: p_G=p_B \qquad H_1:p_G\not=p_B $

    Confidence interval: $-.0335 \lt p_G-p_B \lt .1535$

    Conclusion: Fail to reject $H_0$ because 0 is in the confidence interval.

    Inference: There is not enough evidence to reject the claim that there is no difference in proportion between boys and girls.
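
    The interval can be reproduced in R with `prop.test`, in the same way as the earlier exercises. Girls are listed first so that the interval is for $p_G-p_B$; the continuity correction is turned off to match the hand formula.

    ```r
    ci <- prop.test(c(67, 58), n = c(150, 150),
                    alternative = "two.sided",
                    conf.level = 0.90,
                    correct = FALSE)$conf.int
    round(ci, 4)
    ```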

  11. An advertising agency conducted a random survey of adults asking their primary source of news and educational level. $$\begin{array}{c|ccc|c} \textrm{Primary Source}&\textrm{Not High School}&\textrm{High School But}&\textrm{College}&\textrm{Total}\\ \textrm{of News}&\textrm{Graduate}&\textrm{Not College Graduate}&\textrm{Graduate}&\\ \hline \textrm{Newspapers}&49&205&188&442\cr \textrm{Television}&203&665&223&1091\cr \textrm{Internet}&41&401&245&687\cr \hline \textrm{Total}&293&1271&656&2220 \end{array}$$ Test the claim that the proportion of college graduates whose primary news source is the internet is higher than the proportion of high school (but not college) graduates whose news source is the internet. Use the $P$-value method with significance level 0.01.

    Assumptions: $401,\ 870,\ 245,\ 411 \geq 5$

    $H_0: p_C=p_H \qquad H_1: p_C>p_H $

    Test statistic: $z=2.55$
    $P$-value $=.0054$

    Conclusion: Reject the null hypothesis because the $P$-value $<\alpha$.

    Inference: There is enough evidence to support the claim that the proportion of college graduates whose primary news source is the internet is higher than the proportion of high school (but not college) graduates whose news source is the internet.
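
    A sketch of the same test in R, with the counts read from the Internet row of the table (college graduates first, then high school graduates). `prop.test` reports a chi-square statistic, which equals $z^2$ for two samples, so its square root recovers $|z|$.

    ```r
    # Internet as primary news source, out of each education group's total
    x <- c(245, 401)
    n <- c(656, 1271)

    res <- prop.test(x, n = n, alternative = "greater", correct = FALSE)
    z <- sqrt(unname(res$statistic))   # |z|; positive here because the first sample proportion is larger
    p_value <- res$p.value

    round(c(z = z, p_value = p_value), 4)
    ```

    Software uses the unrounded $z$, so the $P$-value may differ from the table value in the fourth decimal place.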

  12. A survey of 40 freshmen and 40 sophomores shows that 29 of the freshmen and 23 of the sophomores were satisfied with the food choices at Lil's.

    1. Find a $95\%$ confidence interval for the difference in the proportions of satisfied freshmen and sophomores.
    2. Using the confidence interval above, test the claim that there is no difference in the proportion of satisfied freshmen and sophomores. (Use $ \alpha=.05$.)
    1. Assumptions: $29,\ 11,\ 23,\ 17 \geq 5$; $E=.206$

      Confidence interval: $-.056 \lt p_F-p_S \lt .356$

      Interpretation: We are $95\%$ confident that the difference in the proportions of satisfied freshmen and sophomores is between -.056 and .356.

    2. The null hypothesis for testing this claim is $H_0 : p_F - p_S = 0$. Because $0$ is in the confidence interval, we fail to reject the null hypothesis. There is not enough evidence of a difference in the proportions of satisfied freshmen and sophomores.
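
    The margin of error and interval can be checked with a short R sketch that uses the unpooled standard error, as is usual for a confidence interval. The variable names are illustrative.

    ```r
    x <- c(freshmen = 29, sophomores = 23)
    n <- c(freshmen = 40, sophomores = 40)

    p_hat <- x / n
    z_crit <- qnorm(0.975)                              # about 1.96 for 95% confidence
    E <- z_crit * sqrt(sum(p_hat * (1 - p_hat) / n))    # margin of error
    diff_hat <- unname(p_hat["freshmen"] - p_hat["sophomores"])
    ci <- c(lower = diff_hat - E, upper = diff_hat + E)

    round(c(E = E, ci), 3)
    ```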

  13. Chantix is a drug used as an aid to stop smoking. The numbers of subjects experiencing insomnia for each of two treatment groups in a clinical trial of the drug Chantix are given below. Use the P-value method with a significance level of 0.05 to test the claim that there is more insomnia in the treatment group. $$\begin{array}{l|r|r} & \textrm{Chantix Treatment} & \textrm{Placebo}\\\hline \textrm{Number in Group} & 129 & 805\\\hline \textrm{Number Experiencing Insomnia} & 19 & 13\\\hline \end{array}$$

    $H_0 : p_{chantix} = p_{placebo}$; $H_1 : p_{chantix} \gt p_{placebo}$; verify assumptions: $19,\ 110,\ 13,\ 792 \ge 5$;

    $\displaystyle{\overline{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{19 + 13}{129 + 805} \doteq 0.0343}$

    $\displaystyle{\overline{q} = 1 - \overline{p} \doteq 0.9657}$

    $\displaystyle{z = \frac{(\widehat{p}_1 - \widehat{p}_2) - (p_1 - p_2)}{\sqrt{\overline{p} \overline{q} \displaystyle{\left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}}}$ $\displaystyle{ = \frac{\displaystyle{ \left( \frac{19}{129} - \frac{13}{805} \right) - 0}}{\displaystyle{\sqrt{(0.0343)(0.9657) \left( \frac{1}{129} + \frac{1}{805} \right)}}} \doteq 7.60}$

    The $p$-value is extremely small ($1.5 \times 10^{-14} \lt \alpha = 0.05$), so we reject the null hypothesis.

    The proportion with insomnia is significantly higher in the group treated with Chantix.
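
    A minimal R sketch of the calculation above. The upper tail is requested directly with `lower.tail = FALSE`, which avoids the loss of precision that `1 - pnorm(z)` would suffer for such a large $z$. The variable names are illustrative.

    ```r
    x <- c(chantix = 19, placebo = 13)     # subjects experiencing insomnia
    n <- c(chantix = 129, placebo = 805)   # group sizes

    p_bar <- sum(x) / sum(n)                                  # pooled proportion
    se <- sqrt(p_bar * (1 - p_bar) * sum(1 / n))              # pooled standard error
    z <- unname((x["chantix"] / n["chantix"] - x["placebo"] / n["placebo"]) / se)
    p_value <- pnorm(z, lower.tail = FALSE)                   # right-tailed P-value

    c(z = z, p_value = p_value)
    ```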

  14. A simple random sample of front-seat occupants involved in car crashes is obtained. Among 2823 occupants not wearing seat belts, 31 were killed. Among 7765 occupants wearing seat belts, 16 were killed. Construct a 90% confidence interval estimate of the difference between the fatality rates for those not wearing seat belts and those wearing seat belts. What does the result suggest about the effectiveness of seat belts?

    $\displaystyle{\widehat{p}_n = \frac{31}{2823} \doteq 0.0110}$

    $\displaystyle{\widehat{p}_s = \frac{16}{7765} \doteq 0.0021}$

    $\displaystyle{E = z_{\alpha/2} \sqrt{\frac{\widehat{p}_1 \widehat{q}_1}{n_1} +\frac{\widehat{p}_2 \widehat{q}_2}{n_2}} = 1.645 \sqrt{\frac{(0.0110)(0.9890)}{2823}+\frac{(0.0021)(0.9979)}{7765}} \doteq 0.0033}$

    $0.0056 \lt p_n - p_s \lt 0.0122$

    Thus, we are $90\%$ confident that the difference between the fatality rates for those not wearing seat belts and those wearing seat belts is between $0.0056$ and $0.0122$. Seat belts appear to be effective in preventing fatalities. (The null hypothesis $p_n = p_s$ is rejected, since $p_n - p_s = 0$ is not in the confidence interval.)
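
    The same interval can be obtained in R with `prop.test`, listing the occupants not wearing seat belts first. Because R does not round the sample proportions, the endpoints may differ from the hand calculation in the last digit.

    ```r
    ci <- prop.test(c(31, 16), n = c(2823, 7765),
                    alternative = "two.sided",
                    conf.level = 0.90,
                    correct = FALSE)$conf.int
    round(ci, 5)
    ```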

  15. In Cleveland, a sample of 73 mail carriers showed that 10 had been bitten by an animal during one week. In Philadelphia, in a sample of 80 mail carriers, 16 had received animal bites. Is there a significant difference in the proportions? Use the confidence interval method with $\alpha = 0.05$.

    $\widehat{p}_c \doteq 0.1370$; $\widehat{p}_p = 0.2$; $z_{\alpha/2} \doteq 1.96$; $E \doteq 0.1179$; confidence interval: $-0.1809 \lt p_c - p_p \lt 0.0549$. Fail to reject $H_0 : p_c = p_p$ because $0$ is in the interval; there is no significant evidence of a difference in proportions.
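
    A sketch of the same interval in R, with Cleveland listed first so that the interval is for $p_c-p_p$:

    ```r
    ci <- prop.test(c(10, 16), n = c(73, 80),
                    alternative = "two.sided",
                    conf.level = 0.95,
                    correct = FALSE)$conf.int
    round(ci, 4)
    ```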

  16. Lipitor is a drug used to control cholesterol. In clinical trials of Lipitor, 94 subjects were treated with Lipitor and 270 subjects were given a placebo. Among those treated with Lipitor, 7 developed infections. Among those given a placebo, 27 developed infections. Use a $0.05$ significance level to test the claim that the rate of infections was the same for those treated with Lipitor and those given a placebo.

    $H_0 : p_1 = p_2, H_1 : p_1 \ne p_2$; Test statistic: $z = -0.73$; Critical values: $z = \pm 1.96$; $p$-value: $0.4638$. Fail to reject $H_0$. There is no significant evidence that the rate of infections is different between the two groups.
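
    A sketch of this test in R, taking the Lipitor group as sample 1. `prop.test` reports a chi-square statistic equal to $z^2$, so the sign of $z$ is recovered from the difference of the sample proportions.

    ```r
    res <- prop.test(c(7, 27), n = c(94, 270),
                     alternative = "two.sided",
                     correct = FALSE)

    # z has the sign of (p_hat_1 - p_hat_2) and magnitude sqrt(X-squared)
    z <- unname(sign(res$estimate[1] - res$estimate[2]) * sqrt(res$statistic))
    p_value <- res$p.value

    round(c(z = z, p_value = p_value), 4)
    ```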

  17. In 1993, a survey of 560 college students found that 171 said they used illegal drugs during the previous year. In a recent survey of 720 college students, 263 said they used illegal drugs during the previous year. Use a $0.05$ significance level to test the claim that the proportion of college students using illegal drugs has increased.

    With $p_1$ the 1993 proportion and $p_2$ the recent proportion: $H_0 : p_1 = p_2, H_1 : p_1 \lt p_2$; Test statistic: $z = -2.25$; Critical value: $z = -1.645$; $p$-value: $0.0123$. Reject the null hypothesis; there is evidence to support the claim that the proportion of college students using illegal drugs has increased.
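
    A minimal R sketch of the left-tailed test, assuming sample 1 is the 1993 survey and sample 2 the recent one. The variable names are illustrative.

    ```r
    x <- c(y1993 = 171, recent = 263)   # students reporting illegal drug use
    n <- c(y1993 = 560, recent = 720)   # sample sizes

    p_bar <- sum(x) / sum(n)                              # pooled proportion
    se <- sqrt(p_bar * (1 - p_bar) * sum(1 / n))          # pooled standard error
    z <- unname((x["y1993"] / n["y1993"] - x["recent"] / n["recent"]) / se)
    z_crit <- qnorm(0.05)                                 # left-tail critical value
    p_value <- pnorm(z)                                   # left-tailed P-value

    round(c(z = z, z_crit = z_crit, p_value = p_value), 4)
    ```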

  18. A simple random sample is taken of front-seat occupants involved in car crashes. Of the 2823 occupants not wearing seat belts, 31 were killed. Among the 7765 occupants wearing seat belts, 16 were killed. Construct a $90\%$ confidence interval estimate of the difference between the fatality rates for those not wearing seat belts and those wearing seat belts. What does the result suggest about the effectiveness of seat belts?

    $0.00558 \lt p_{nw} - p_{w} \lt 0.0123$; Because the confidence interval does not include $0$, it appears that the two fatality rates are not equal. Because the confidence interval consists of only positive values, it appears that the fatality rate is higher for those not wearing seat belts. The use of seat belts appears to be effective in saving lives.