## Exercises - Kruskal-Wallis, Wilcoxon Tests 2

1. The number of grams of carbohydrates contained in 1-ounce servings of randomly selected chocolate and non-chocolate candy is listed below. The variances of the samples are significantly different. Test at $\alpha=.05$ the claim that there is no difference in carbohydrate content in the two kinds of candy. $$\begin{array}{ r|ccccccccccccc} \hbox{Chocolate: }&17&24&25&25&27&29&29&29&32&34&36&38&41\cr \hbox{Non-chocolate: }&10&12&29&29&30&37&38&39&41&41&55 \end{array}$$

Assumptions: both $n\geq 10$

Null hypothesis: No difference in the carbohydrate content in the two kinds of candy. $$\begin{array}{ r|cccccccccccccc} \hbox{Chocolate: }&17&24&25&25&27&29&29&29&32&34&36&38&41\cr \hbox{Rankings} &3&4&5.5&5.5&7&10&10&10&14&15&16&18.5&22&(R=140.5)\cr\cr \hbox{Non-chocolate: }&10&12&29&29&30&37&38&39&41&41&55\cr \hbox{Rankings} &1&2&10&10&13&17&18.5&20&22&22&24&&&(R=159.5) \end{array}$$ Test statistic: $z=1.27$ or $z=-1.27$ (There is only one test statistic, but either one is possible.)

Critical Value: $\pm 1.96$

Fail to reject the null hypothesis. The test statistic is not in the rejection region.

There is not enough evidence to reject the claim that there is no difference in carbohydrate content in the two kinds of candy.
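The ranking and the test statistic above can be reproduced with a short script. This is a minimal sketch, assuming the usual normal approximation for the rank sum of the smaller sample with no tie correction (which matches the stated $z$); the helper names `midranks` and `rank_sum_z` are illustrative, not from the source.

```python
from math import sqrt

def midranks(values):
    """Map each distinct value to the average of the positions it occupies when sorted."""
    positions = {}
    for i, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(i)
    return {v: sum(p) / len(p) for v, p in positions.items()}

def rank_sum_z(sample_a, sample_b):
    """Return (R, z) for the smaller sample, using the normal approximation."""
    small, large = sorted((sample_a, sample_b), key=len)
    rank_of = midranks(list(small) + list(large))
    n1, n2 = len(small), len(large)
    r = sum(rank_of[v] for v in small)            # rank sum of the smaller sample
    mu = n1 * (n1 + n2 + 1) / 2                   # expected rank sum under H0
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # standard deviation (no tie correction)
    return r, (r - mu) / sigma

chocolate = [17, 24, 25, 25, 27, 29, 29, 29, 32, 34, 36, 38, 41]
non_chocolate = [10, 12, 29, 29, 30, 37, 38, 39, 41, 41, 55]

r, z = rank_sum_z(chocolate, non_chocolate)
print(r, round(z, 2))  # 159.5 1.27
```

Using the chocolate rank sum ($R=140.5$, $n=13$) in the same formulas gives $z=-1.27$, which is why either sign can appear.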

2. To determine whether a significant difference exists in the lengths of fish from two hatcheries, 11 fish were randomly selected from hatchery A, and 10 fish were randomly selected from hatchery B. Their lengths, in centimeters, are given below. The variances of the data sets are significantly different. Test the claim that there is no difference in the fish lengths for the two hatcheries using the critical value method with $\alpha=.05$. $$\begin{array}{lccccccccccc} \textrm{Hatchery A} & 12.4 & 12.7 & 12.9 & 13.3 & 14.2 & 14.3 & 14.3 & 14.8 & 14.8 & 15.3 & 15.3\\ \textrm{Hatchery B} & 10.7 & 12.2 & 12.8 & 13.9 & 14.1 & 14.3 & 14.6 & 15.6 & 16.8 & 18.1 & \end{array}$$

Assumptions: both $n\geq 10$

Null hypothesis: No difference in the fish lengths for the two hatcheries $$\begin{array}{ r|cccccccccccc} \hbox{Hatchery A:} & 12.4 & 12.7 & 12.9 & 13.3 & 14.2 & 14.3 & 14.3 & 14.8 & 14.8 & 15.3 & 15.3\\ \hbox{Rankings} &3&4&6&7&10&12&12&15.5&15.5&17.5&17.5&(R=120)\cr\cr \hbox{Hatchery B:} & 10.7 & 12.2 & 12.8 & 13.9 & 14.1 & 14.3 & 14.6 & 15.6 & 16.8 & 18.1\\ \hbox{Rankings} &1&2&5&8&9&12&14&19&20&21&&(R=111) \end{array}$$ Test statistic: $z=0.0704$ or $z=-0.0704$ (There is only one test statistic, but either one is possible.)

Critical Value: $\pm 1.96$

Fail to reject the null hypothesis. The test statistic is not in the rejection region.

There is no significant difference in the fish lengths for the two hatcheries.
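As a check on the arithmetic, here is the test statistic worked out from the rank sum of the smaller sample (hatchery B, $n_1=10$, $R=111$). This assumes the usual normal approximation for the rank sum, which this solution does not restate: $$\mu_R=\frac{n_1(n_1+n_2+1)}{2}=\frac{10(22)}{2}=110,\qquad \sigma_R=\sqrt{\frac{n_1n_2(n_1+n_2+1)}{12}}=\sqrt{\frac{10\cdot 11\cdot 22}{12}}\approx 14.20,\qquad z=\frac{R-\mu_R}{\sigma_R}=\frac{111-110}{14.20}\approx 0.0704$$ Using hatchery A instead ($n_1=11$, $R=120$, $\mu_R=121$) gives $z\approx -0.0704$.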

3. A tax collector wishes to compare the values of tax-exempt properties for two large cities. The values for two random samples are shown. The variances of the data sets are significantly different. Test the claim that the property values are the same in the two cities at $\alpha=.05$. $$\begin{array}{r|ccccccccccccccc} \hbox{City A: }&2&5&7&8&11&14&19&19&22&23&23&25&30&31&44\cr \hbox{City B: }&2&4&5&5&9&11&12&17&17&19&20&40&51&52&68 \end{array}$$

Assumptions: both $n\geq 10$

Null hypothesis: No difference in the property values of the 2 cities. $$\begin{array}{r|cccccccccccccccc} \hbox{City A: }&2&5&7&8&11&14&19&19&22&23&23&25&30&31&44\cr \hbox{Rankings} &1.5&5&7&8&10.5&13&17&17&20&21.5&21.5&23&24&25&27&(R=241)\cr\cr \hbox{City B: }&2&4&5&5&9&11&12&17&17&19&20&40&51&52&68\cr \hbox{Rankings} &1.5&3&5&5&9&10.5&12&14.5&14.5&17&19&26&28&29&30&(R=224) \end{array}$$ Test statistic: $z=0.35$ or $z=-0.35$ (There is only one test statistic, but either one is possible.)

Critical Value: $\pm 1.96$

Fail to reject the null hypothesis. The test statistic is not in the rejection region.

There is no significant difference in the property values of the 2 cities.

4. Two groups of employees were given a questionnaire to ascertain their degree of job satisfaction. The scale ranged from 0 to 100. The groups were divided into those who had under 5 years of work experience and those who had 5 or more years of experience. Test the claim that there is no difference in the job satisfaction of the two groups as measured by the questionnaire. Use the $p$-value method with significance level 0.05. Why is a parametric test inappropriate? $$\begin{array}{r|ccccccccccccc} \hbox{Under 5:}&56&68&72&75&77&77&83&86&93&93&97&98&99\cr\cr \hbox{5 and over:}&52&56&59&63&64&66&68&73&79&82&85&93&94 \end{array}$$

Use a non-parametric test because the data is ordinal.

Assumptions: both $n\geq 10$

Null hypothesis: No difference in the job satisfaction of the two groups $$\begin{array}{r|cccccccccccccc} \hbox{Under 5:}&56&68&72&75&77&77&83&86&93&93&97&98&99\\ \hbox{Rankings} &2.5&8.5&10&12&13.5&13.5&17&19&21&21&24&25&26&\qquad R=213\\\\ \hbox{5 and over:}&52&56&59&63&64&66&68&73&79&82&85&93&94\\ \hbox{Rankings}&1&2.5&4&5&6&7&8.5&11&15&16&18&21&23& \qquad R=138 \end{array}$$ Test statistic: $z=1.92$ or $z=-1.92$ (There is only one test statistic, but either one is possible.)

$p$-Value: $0.0548$

Fail to reject the null hypothesis. The $p$-value is greater than $\alpha$.

There is no significant difference in the job satisfaction of the two groups.
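The $p$-value step can also be checked numerically. A minimal sketch, assuming a two-tailed test and the rounded statistic $z=1.92$ from above; `two_tailed_p` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
p = two_tailed_p(1.92)
print(round(p, 3), p > alpha)  # 0.055 True
```

Software carries more digits than a four-decimal normal table, so the result can differ from the table-based $0.0548$ in the last place; the decision is the same either way.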

5. A grocery store conducts a survey asking customers to rate (on a scale of 1 to 10) one of two different brands of canned cranberry sauce. Determine whether there is a difference in ratings between the brands. Use the $p$-value method with significance level 0.05. $$\begin{array}{r|cccccccccccc} \hbox{Ratings for Brand A:} &2&3&4&4&5&5&5&6&7&8\cr\cr \hbox{Ratings for Brand B:}&1&4&4&5&5&6&6&7&7&7&7&8 \end{array}$$

Use a non-parametric test because the data is ordinal.

Assumptions: both $n\geq 10$

Null hypothesis: No difference in ratings between the brands. $$\begin{array}{r|ccccccccccccc} \hbox{Ratings for Brand A:} &2&3&4&4&5&5&5&6&7&8\\ \hbox{Rankings}&2&3&5.5&5.5&10&10&10&14&18&21.5&&& (R=99.5)\\\\ \hbox{Ratings for Brand B:}&1&4&4&5&5&6&6&7&7&7&7&8\\ \hbox{Rankings}&1&5.5&5.5&10&10&14&14&18&18&18&18&21.5& (R=153.5) \end{array}$$ Test statistic: $z=1.02$ or $z=-1.02$ (There is only one test statistic, but either one is possible.)

$p$-Value: $0.3078$

Fail to reject the null hypothesis. The $p$-value is greater than $\alpha$.

There is no significant difference in ratings between the brands.

6. Random samples of 3 brands of chocolate chip cookies are obtained and the number of chips in each cookie is recorded. Assume the distributions are approximately normal. $$\begin{array}{r|ccccccc} \hbox{Brand A:} &12&13&13&14&14&15&17\cr \hbox{Brand B:} &10&12&14&15&18&20&21\cr \hbox{Brand C:} &9&10&10&11&13&14&14 \end{array}$$

1. Show that the variance of Brand A is significantly different from the variance of Brand B at $\alpha=.05$.

2. Test the claim that the number of chocolate chips differs among the 3 brands at $\alpha=.05$. Choose an appropriate test based on the fact that variances are significantly different.

1. $H_0: \sigma^2_A=\sigma^2_B \qquad H_1: \sigma^2_A\not=\sigma^2_B$

Test statistic: $F=6.34$

Critical value: $5.82$ (dfN=6, dfD=6, .025 in each tail)

Reject the null hypothesis. The test statistic is in the rejection region.

The variances are significantly different.
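The $F$ statistic is the ratio of the larger sample variance to the smaller. A minimal sketch using unrounded sample variances; the critical value $5.82$ is taken from the solution above rather than computed.

```python
from statistics import variance

brand_a = [12, 13, 13, 14, 14, 15, 17]
brand_b = [10, 12, 14, 15, 18, 20, 21]

var_a = variance(brand_a)   # sample variance (divides by n - 1)
var_b = variance(brand_b)
f_stat = max(var_a, var_b) / min(var_a, var_b)  # larger variance in the numerator

critical_value = 5.82       # from the solution: dfN = 6, dfD = 6, 0.025 in each tail
print(round(var_a, 3), round(var_b, 3), round(f_stat, 2))  # 2.667 16.905 6.34
print(f_stat > critical_value)  # True
```

Rounding the standard deviations to two decimals before squaring ($4.11$ and $1.63$) gives about $6.36$ instead of $6.34$; both values fall in the rejection region.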

2. Use the non-parametric Kruskal-Wallis test.

Assumptions: all $n\geq 5$

Null hypothesis: No difference in the number of chocolate chips among the 3 brands. $$\begin{array}{r|cccccccc} \hbox{Brand A:} &12&13&13&14&14&15&17\cr \hbox{Rankings}&6.5&9&9&13&13&16.5&18& \qquad R=85\cr\cr \hbox{Brand B:} &10&12&14&15&18&20&21\cr \hbox{Rankings}&3&6.5&13&16.5&19&20&21& \qquad R=99\cr\cr \hbox{Brand C:} &9&10&10&11&13&14&14\cr \hbox{Rankings}&1&3&3&5&9&13&13& \qquad R=47 \end{array}$$ Test statistic: $H=5.37$

Critical Value: $\chi^2=5.991$

Fail to reject the null hypothesis. The test statistic is not in the rejection region.

There is not enough evidence to support the claim that the number of chocolate chips differs among the 3 brands.
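The Kruskal-Wallis computation above can be sketched as follows. This assumes the standard $H$ formula without a tie correction, which reproduces the stated value; `midranks` and `kruskal_wallis_h` are illustrative names. For 2 degrees of freedom the chi-square right-tail area is $e^{-x/2}$, so the critical value has the closed form $-2\ln\alpha$.

```python
from math import log

def midranks(values):
    """Map each distinct value to the average of the positions it occupies when sorted."""
    positions = {}
    for i, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(i)
    return {v: sum(p) / len(p) for v, p in positions.items()}

def kruskal_wallis_h(*groups):
    """Return (rank sums, H) for the given samples, without tie correction."""
    pooled = [v for g in groups for v in g]
    rank_of = midranks(pooled)
    n_total = len(pooled)
    rank_sums = [sum(rank_of[v] for v in g) for g in groups]
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h

brand_a = [12, 13, 13, 14, 14, 15, 17]
brand_b = [10, 12, 14, 15, 18, 20, 21]
brand_c = [9, 10, 10, 11, 13, 14, 14]

rank_sums, h = kruskal_wallis_h(brand_a, brand_b, brand_c)
critical = -2 * log(0.05)   # chi-square critical value, valid only for df = 2
print(rank_sums, round(h, 2), round(critical, 3))  # [85.0, 99.0, 47.0] 5.37 5.991
```

Since $5.37 < 5.991$, the script agrees with the decision to fail to reject the null hypothesis.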

7. A study was conducted of lead levels in children living close to a lead smelter. The blood lead level of each child was measured and also their IQ score. Use the data below to test the claim that there is a difference in IQ score for the three groups of children. Use the non-parametric test at $\alpha=0.01$. $$\begin{array}{r|ccccccccc} \hbox{Low lead level}&76&76&85&86&89&95&96&102&108\\ \hbox{Medium lead level}&78&82&92&97&111\\ \hbox{High lead level}&75&76&79&80&96 \end{array}$$

Assumptions: all $n\geq 5$

Null hypothesis: No difference in IQ score for the three groups of children. $$\begin{array}{r|cccccccccc} \hbox{Low lead level}&76&76&85&86&89&95&96&102&108\\ \hbox{Rankings}&3&3&9&10&11&13&14.5&17&18& \qquad R=98.5\\\\ \hbox{Medium lead level}&78&82&92&97&111\\ \hbox{Rankings}&5&8&12&16&19&&&&& \qquad R=60\\\\ \hbox{High lead level}&75&76&79&80&96\\ \hbox{Rankings}&1&3&6&7&14.5&&&&& \qquad R=31.5 \end{array}$$ Test statistic: $H=3.047$

Critical Value: $\chi^2=9.210$

Fail to reject the null hypothesis. The test statistic is not in the rejection region.

There is not enough evidence to support the claim that there is a difference in IQ score for the three groups of children.

8. A researcher wishes to try three different techniques to lower the blood pressure of individuals diagnosed with high blood pressure. The subjects are randomly assigned to three groups; the first group takes medication, the second group exercises, and the third group follows a special diet. After four weeks, the reduction in each person's blood pressure is recorded. Is there a significant difference between the techniques used to lower blood pressure? Use a non-parametric test at $\alpha=0.05$. $$\begin{array}{r|ccccc} \hbox{Medication group}&9&10&12&13&15\\ \hbox{Exercise group} &0&2&3&6&8\\ \hbox{Diet group} &4&5&8&9&12 \end{array}$$

Assumptions: all $n\geq 5$

Null hypothesis: No difference between the techniques used to lower blood pressure $$\begin{array}{r|cccccc} \hbox{Medication group}&9&10&12&13&15\\ \hbox{Rankings}&9.5&11&12.5&14&15& \qquad R=62\\\\ \hbox{Exercise group} &0&2&3&6&8\\ \hbox{Rankings}&1&2&3&6&7.5& \qquad R=19.5\\\\ \hbox{Diet group} &4&5&8&9&12 \\ \hbox{Rankings}&4&5&7.5&9.5&12.5& \qquad R=38.5 \end{array}$$ Test statistic: $H=9.065$

Critical Value: $\chi^2=5.991$

Reject the null hypothesis. The test statistic is in the rejection region.

There is enough evidence to support the claim that there is a difference between the techniques used to lower blood pressure.
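For reference, here is the test statistic worked out from the rank sums, assuming the standard Kruskal-Wallis formula with $N=15$ and $n_i=5$ for each group: $$H=\frac{12}{N(N+1)}\sum\frac{R_i^2}{n_i}-3(N+1)=\frac{12}{15\cdot 16}\left(\frac{62^2+19.5^2+38.5^2}{5}\right)-3(16)=0.05(1141.3)-48\approx 9.065$$ This exceeds the critical value $5.991$, so the statistic lies in the rejection region.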

9. A meteorologist wishes to see if there is a difference in the number of deaths in the United States due to different types of severe weather. The data from 6 years are shown here. $$\begin{array}{r|cccccc} \hbox{Lightning}&39&41&67&68&73&74\\ \hbox{Tornado}&30&32&39&39&50&53\\ \hbox{Blizzard}&35&39&43&48&54&56 \end{array}$$

1. Using the non-parametric test at $\alpha=.10$, is there a difference in the number of deaths from the different weather conditions?

2. Describe the follow-up procedure for finding where the difference lies. Explain why we should not attempt this procedure for this problem.

1. Assumptions: all $n\geq 5$

Null hypothesis: No difference in the number of deaths from the different weather conditions. $$\begin{array}{r|ccccccc} \hbox{Lightning}&39&41&67&68&73&74\\ \hbox{Rankings} &5.5&8&15&16&17&18&\qquad R=79.5\\\\ \hbox{Tornado}&30&32&39&39&50&53\\ \hbox{Rankings} &1&2&5.5&5.5&11&12&\qquad R=37\\\\ \hbox{Blizzard}&35&39&43&48&54&56\\ \hbox{Rankings} &3&5.5&9&10&13&14&\qquad R=54.5 \end{array}$$ Test statistic: $H=5.34$

Critical Value: $\chi^2=4.605$

Reject the null hypothesis. The test statistic is in the rejection region.

There is enough evidence to support the claim that there is a difference in the number of deaths from the different weather conditions.

2. The follow-up procedure is to run Wilcoxon rank sum tests on each of the three pairs of samples. However, each sample here has only 6 values, fewer than the 10 required, so the assumptions for the Wilcoxon test are not met.