A $\chi^2$ test with 3 degrees of freedom has significance level .10. Find the critical value.
Table: 6.251

R:

    > qchisq(0.90,3)
    [1] 6.251389

Excel: "=CHISQ.INV(0.9,3)" gives 6.251388631
A researcher wants to know whether responses to a statement (strongly agree, agree, no opinion, disagree, strongly disagree) are dependent on the gender of the interviewer. Which test should we use? Find the null hypothesis and the critical value at $\alpha=.01$.
Table: 13.277

R:

    > qchisq(0.99,4)
    [1] 13.2767

Excel: "=CHISQ.INV(0.99,4)" gives 13.27670414
An 8-sided die is rolled 200 times in order to test whether the die is fair. Which test should we use? Find the null hypothesis and check the assumptions for the test. Find the critical value at $\alpha=.05$.
Dr. Penta claims to have designed a five-sided die that is equally likely to land on sides 1 through 4, but lands the fifth side $40\%$ of the time.
You and a friend are munching on a bag of Harvest Blend M&M's, when your friend says, "There seems to be more yellow and brown candies than red and maroon candies. In fact, I claim there are $30\%$ yellow, $30\%$ brown, and only $20\%$ red and $20\%$ maroon." Together you count the remaining M&M's in the bag with the results below. Use the critical value method with significance level 0.05 to test your friend's claim.
$$\begin{array}{c|c|c|c|c|c} \hbox{Color}&\hbox{Yellow}&\hbox{Brown}&\hbox{Red}&\hbox{Maroon}&\hbox{Total}\\\hline \hbox{Number}&58&61&55&46&220 \end{array}$$

R:

    # Test Statistic
    chisq.test(c(58,61,55,46),p=c(0.3,0.3,0.2,0.2))$statistic
    X-squared
     4.189394
    # Critical Value
    qchisq(0.95,3)
    [1] 7.814728
A sample of coin flips is collected from three different coins. The results are below. Use one hypothesis test to test the claim that all three coins have the same probability of landing heads. Use the critical value method with significance level 0.10.
$$\begin{array}{c|c|c|c} &\hbox{Coin A}&\hbox{Coin B}&\hbox{Coin C}\\\hline \hbox{Heads}&88&93&110 \\\hline \hbox{Tails} &112&107&90 \end{array}$$

R:

    # Test Statistic
    > m = matrix(c(88,93,110,112,107,90),ncol=3,byrow=TRUE)
    > colnames(m) = c("Coin A","Coin B","Coin C")
    > rownames(m) = c("Heads","Tails")
    > summary(as.table(m))$statistic
    [1] 5.324792
    # Critical Value
    > qchisq(0.90,2)
    [1] 4.60517
An advertising agency conducted a random survey of adults asking their primary source of news and educational level.
$$\begin{array}{c|ccc|c} \hbox{Primary Source}&\hbox{Not High School }&\hbox{High School But }&\hbox{College}&\hbox{Total}\cr \hbox{of News}&\hbox{Graduate}&\hbox{Not College Graduate}&\hbox{Graduate}&\cr \hline \hbox{Newspapers}&49&205&188&442\cr \hbox{Television}&203&665&223&1091\cr \hbox{Internet}&41&401&245&687\cr \hline \hbox{Total}&293&1271&656&2220 \end{array}$$The advertising company wants to test whether there is a relationship between the 3 educational levels and the 3 primary news sources. Find the null hypothesis and degrees of freedom for the test. Show that the assumptions for the test are met for the category: "Newspapers/Not High School Graduate".
Test the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet. Use the critical value method with significance level 0.05.
$H_0:$ The primary news source and educational level are independent.
Degrees of freedom $=(3-1)(3-1)=4$
$E=\displaystyle{(442)(293)\over 2220}=58.3\geq 5$
Assumptions: each expected count is $E = 656/3 \approx 218.7 \geq 5$
$H_0: p_N=p_T=p_I=1/3$
Test statistic: $ \chi^2=7.558$
Critical value: $ 5.991$
Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.
Inference: There is enough evidence to reject the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet.
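As a rough cross-check of the two solutions above, the following R sketch recomputes the expected count for the "Newspapers/Not High School Graduate" cell and the goodness-of-fit test on the college graduate column. The object names (`news`, `e11`, `college`) are illustrative and not part of the original exercise.

```r
# Observed counts (rows: primary news source, columns: education level)
news <- matrix(c( 49, 205, 188,
                 203, 665, 223,
                  41, 401, 245),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Newspapers", "Television", "Internet"),
                               c("NotHS", "HS", "College")))

# Expected count for one cell: (row total)(column total) / grand total
e11 <- sum(news["Newspapers", ]) * sum(news[, "NotHS"]) / sum(news)

# Goodness-of-fit test on the College column; chisq.test() assumes
# equal proportions when no p argument is supplied
college <- news[, "College"]
gof  <- chisq.test(college)
stat <- unname(gof$statistic)
crit <- qchisq(0.95, df = length(college) - 1)

print(c(expected = e11, statistic = stat, critical = crit))
```

The test statistic exceeding the critical value corresponds to the "reject" conclusion stated above.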
A school nurse wants to determine whether age is a factor in whether children choose a healthy snack after school. She conducts a survey of 300 middle school students, with the results below. Test at $\alpha=.05$ the claim that the proportion who choose a healthy snack differs by grade level. Use the critical value method.
$$\begin{array}{c|c|c|c} \hbox{Grade level: } &\hbox{6th grade} &\hbox{7th grade} &\hbox{8th grade}\cr\hline \hbox{Healthy snack} &31 &43 &51 \cr\hline \hbox{Unhealthy snack} &69 & 57 & 49 \end{array}$$

A survey asked adults nationwide if they thought that the federal government should continue to fund unmanned missions to Mars. Fifty-six percent said they should continue, $40\%$ said they should not continue, and $4\%$ had no opinion. A random sample of 200 college students resulted in the numbers below. At significance level 0.05, test the claim that the opinions of college students on this issue differ from those of the nation as a whole.

$$\begin{array}{c|c|c} \hbox{Should continue}&\hbox{Should not continue} &\hbox{No opinion}\cr\hline 126&65&9 \end{array}$$
To test the claim that snack choices are related to the gender of the consumer, a survey at a ball park shows this selection of snacks purchased. Write the null hypothesis and check the assumptions. Do not do the rest of the hypothesis test. $$\begin{array}{c|c|c|c} &\hbox{Hotdog} &\hbox{Peanuts} &\hbox{Popcorn}\cr\hline \hbox{Male}&6&12&9\cr\hline \hbox{Female}&5&5&8 \end{array}$$
As part of the 1999 College Alcohol Study, 11160 students who drank alcohol in the last year were asked if drinking ever resulted in missing a class. The data are given in the following table: $$\begin{array}{c|ccc} \hbox{Missed a class?}&\hbox{Non-binger}&\hbox{Occasional binger}&\hbox{Frequent binger}\cr \hline \textrm{Yes} & 446 & 915 & 1959 \cr \textrm{No} & 4617 & 2047 & 1176 \cr \end{array}$$
Is the proportion of missed classes related to students' drinking habits?
Find a $99\%$ confidence interval for the proportion of nonbinger students who missed classes.
If we wanted a confidence interval for the proportion of occasional binger students who did not miss class with a $5\%$ margin of error and confidence level $98\%$, how large a sample would we need?
$H_0:$ Students' drinking habits and whether they missed a class are independent.
Check that all $E\geq5$.
d.f. $=(2-1)(3-1)=2$; C.V. $=5.991$ (at $\alpha=0.05$)
The test statistic is $\chi^2=2672$.
We reject $H_0$. There is overwhelming evidence that the proportion of students who missed a class is related to their drinking habits.
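The independence test above can be sketched in R as follows. This is only a cross-check under the assumption of $\alpha=0.05$ (the level implied by the critical value 5.991); the object name `drink` is illustrative.

```r
# Observed counts (rows: missed a class?, columns: drinking habit)
drink <- matrix(c( 446,  915, 1959,
                  4617, 2047, 1176),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("Yes", "No"),
                                c("Non-binger", "Occasional", "Frequent")))

test <- chisq.test(drink)          # no continuity correction for a 2 x 3 table

min_expected <- min(test$expected) # assumption check: every E should be >= 5
stat <- unname(test$statistic)
df   <- unname(test$parameter)
crit <- qchisq(0.95, df)           # alpha = 0.05 is an assumption here

print(c(min_expected = min_expected, statistic = stat, df = df, critical = crit))
```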
$\widehat{p}=\dfrac{446}{11160}\approx 0.04 \qquad \widehat{q}=0.96$
$\alpha/2=0.005 \Longrightarrow z_{\alpha/2}=2.575$
$E = z_{\alpha/2}\sqrt{\widehat{p} \widehat{q} \over n}=2.575\sqrt{\dfrac{0.04\cdot 0.96}{11160}}\approx 0.0048$
Confidence interval: $(0.0352,0.0448)$
We are $99\%$ confident that the proportion of non-binger students who missed classes is between 0.0352 and 0.0448.
$\widehat{p}=\dfrac{2047}{11160}\approx 0.1834 \qquad \widehat{q}=0.8166$

$\alpha/2=0.01 \Longrightarrow z_{\alpha/2}=2.33$

$n=\widehat{p}\widehat{q}\left({z_{\alpha/2} \over E}\right)^2=0.1834 \cdot 0.8166 \left({2.33 \over 0.05}\right)^2\approx 326$
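The confidence interval and sample size computations can be retraced with a short R sketch. It follows the solution in using all 11160 respondents as the denominator and the table values $z=2.575$ and $z=2.33$; because $\widehat{p}$ is not rounded to 0.04 here, the interval endpoints may differ from the ones above in the last digit. All object names are illustrative.

```r
n_total <- 11160                       # denominator used in the solution above

# 99% confidence interval
p_hat <- 446 / n_total
z99   <- 2.575                         # table value; qnorm(0.995) is slightly larger
E     <- z99 * sqrt(p_hat * (1 - p_hat) / n_total)
ci    <- c(lower = p_hat - E, upper = p_hat + E)

# Sample size for a 5% margin of error at 98% confidence
p2       <- 2047 / n_total
z98      <- 2.33                       # table value; qnorm(0.99) is slightly smaller
n_needed <- ceiling(p2 * (1 - p2) * (z98 / 0.05)^2)

print(round(ci, 4))
print(n_needed)
```

The sample size is rounded up with `ceiling()` because rounding down would give a margin of error slightly larger than requested.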
A game where colored marbles are drawn out of a bag with replacement has three possible outcomes: red, green, and blue. The game is played 100 times with the results shown below. Using $\alpha= 0.05$, test the claim that the probabilities for each outcome are as follows: P(red) = .40, P(green) = .35, and P(blue) = .25. $$\begin{array}{c|ccc} \hbox{Color} &\hbox{Red} &\hbox{Green} &\hbox{Blue}\cr\hline \hbox{Number of occurrences}& 32& 45& 23 \end{array}$$
Using the data below, test the claim that there is no difference in the color preferences of men and women. Use $\alpha = .05$. $$\begin{array}{c|ccc} \hbox{Preferred Color:} &\hbox{Red} &\hbox{Yellow} &\hbox{Blue}\cr \hline \hbox{Men}& 21&34& 45\cr \hbox{Women}& 36 &33&31 \end{array}$$
A researcher wishes to see if the five ways (drinking caffeinated beverages, taking a nap, going for a walk, eating a sugary snack, other) people use to combat midday drowsiness are equally distributed among office workers. A sample of 60 office workers is selected, and the following data are obtained. At .10 significance level can it be concluded that there is no preference? $$\begin{array}{l|c|c|c|c|c} \textrm{Method} & \textrm{beverage} & \textrm{nap} & \textrm{walk} & \textrm{snack} & \textrm{other}\\\hline \textrm{Number} & 21 & 16 & 10 & 8 & 5 \end{array}$$
If there is no preference, then all are equally likely. As there are 5 categories, the expectation is that they all occur with probability $0.20$.
Assumptions: each expected count is $E = (0.20)(60) = 12 \ge 5$
$H_0$: There is no preference for a way to combat midday drowsiness
Test statistic: $\chi^2 = 13.83$
Critical value: $7.779$
Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.
Inference: There is significant evidence that the 5 methods to combat midday drowsiness are not all equally likely.
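For readers who want to retrace the arithmetic, here is a minimal R sketch that computes the statistic directly from the definition $\chi^2=\sum (O-E)^2/E$ rather than calling a test function. The vector names are illustrative.

```r
# Observed counts for the five methods
observed <- c(beverage = 21, nap = 16, walk = 10, snack = 8, other = 5)

# Under H0 every method is equally likely, so each expected count is n / 5
expected <- sum(observed) / length(observed)

stat <- sum((observed - expected)^2 / expected)
crit <- qchisq(1 - 0.10, df = length(observed) - 1)

print(c(expected = expected, statistic = stat, critical = crit))
```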
Nationwide the shares of carbon emissions for the year 2000 are transportation, 33%; industry, 30%; residential, 20%; and commercial, 17%. A state hazardous materials official wants to see if her state is the same. Her study of 300 emissions sources finds transportation, 36%; industry, 31%; residential, 17%; and commercial, 16%. At a 0.05 significance level, can she claim the percentages are the same?
$H_0$: The percentages are the same
Check Assumptions:
Assumptions met as all calculated expected counts are $\ge 5$:
$\displaystyle{ \begin{array}{ll} \textrm{transportation} & (0.33)(300) = 99 \ge 5\\ \textrm{industry} & (0.30)(300) = 90 \ge 5\\ \textrm{residential} & (0.20)(300) = 60 \ge 5\\ \textrm{commercial} & (0.17)(300) = 51 \ge 5 \end{array}}$
We must similarly calculate the observed counts to find the test statistic:
$\displaystyle{ \begin{array}{ll} \textrm{transportation} & (0.36)(300) = 108\\ \textrm{industry} & (0.31)(300) = 93\\ \textrm{residential} & (0.17)(300) = 51\\ \textrm{commercial} & (0.16)(300) = 48 \end{array}}$
Test statistic:
$\displaystyle{\chi^2 = \frac{(108-99)^2}{99} + \frac{(93-90)^2}{90} + \frac{(51-60)^2}{60} + \frac{(48-51)^2}{51} \doteq 2.4447}$
Critical value:
$\displaystyle{\left\{ \begin{array}{c} \textrm{degrees of freedom } = (\textrm{number of categories}) - 1 = 4 - 1 = 3\\ \alpha=0.05 \end{array} \right\} \rightarrow 7.815}$

Conclusion: Fail to reject the null hypothesis as the test statistic was not in the rejection region.
Inference: There is no significant evidence that the state percentages are not the same as the national percentages.
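The same steps can be sketched in R. The snippet converts the state percentages back to counts, as the solution does, and passes the national shares as the hypothesized probabilities; the object names are illustrative.

```r
n <- 300
p_national <- c(transportation = 0.33, industry = 0.30,
                residential = 0.20, commercial = 0.17)
p_state    <- c(0.36, 0.31, 0.17, 0.16)

observed <- round(p_state * n)      # observed counts recovered from percentages
expected <- p_national * n          # expected counts under H0

gof  <- chisq.test(observed, p = p_national)
stat <- unname(gof$statistic)
crit <- qchisq(0.95, df = length(observed) - 1)

print(rbind(observed = observed, expected = expected))
print(c(statistic = stat, critical = crit))
```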
A study is conducted as to whether there is a relationship between jogging and the frequency of consumption of nutritional supplements. A random sample of 210 subjects is selected, and they are classified as shown. At a 0.05 significance level, test the claim that jogging and the consumption of supplements are not related. $$\begin{array}{lccc} & \textrm{Daily} & \textrm{Weekly} & \textrm{As Needed}\\\hline \textrm{Joggers} & 34 & 52 & 23\\ \textrm{Non-joggers} & 18 & 65 & 18 \end{array}$$
$H_0$: jogging and the consumption of supplements are not related.
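Expected counts:
$$\begin{array}{lccc} & \textrm{Daily} & \textrm{Weekly} & \textrm{As Needed}\\\hline \textrm{Joggers} & 26.99048 & 60.72857 & 21.28095\\ \textrm{Non-joggers} & 25.00952 & 56.27143 & 19.71905 \end{array}$$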
Assumptions are met. (all $E \ge 5$)
Test statistic: $\chi^2 = 6.68$
Critical value: 5.991
Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.
Inference: There is significant evidence that jogging and the consumption of supplements are related.
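The expected counts and test statistic above can be reproduced with `chisq.test()`, as in this sketch. The matrix name `jog` is illustrative.

```r
# Observed counts (rows: jogging status, columns: supplement frequency)
jog <- matrix(c(34, 52, 23,
                18, 65, 18),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Joggers", "Non-joggers"),
                              c("Daily", "Weekly", "As Needed")))

test <- chisq.test(jog)

print(test$expected)               # expected counts under independence
stat <- unname(test$statistic)
crit <- qchisq(0.95, df = unname(test$parameter))
print(c(statistic = stat, critical = crit))
```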
An advertising firm has decided to ask 92 customers at each of three local shopping malls if they are willing to take part in a market research survey. According to previous studies, 38% of Americans refuse to take part in such surveys. The results are shown here. At a 0.01 significance level, test the claim that the proportions of those who are willing to participate are equal.
$$\begin{array}{lccc} & \textrm{Mall A} & \textrm{Mall B} & \textrm{Mall C}\\\hline \textrm{Will Participate} & 52 & 45 & 36\\ \textrm{Will Not Participate} & 40 & 47 & 56\\ \end{array}$$

$H_0$: the proportions of those who are willing to participate are equal among the 3 malls
Check Assumptions:
We need to first calculate the expected counts using the marginal totals:
$$\begin{array}{l|ccc|c} & \textrm{Mall A} & \textrm{Mall B} & \textrm{Mall C} & \textrm{Total}\\\hline \textrm{Will Participate} & 52 & 45 & 36 & 133\\ \textrm{Will Not Participate} & 40 & 47 & 56 & 143\\\hline \textrm{Total} & 92 & 92 & 92 & 276 \end{array}$$

Then the expected counts are given by:

$$\begin{array}{lccc} & \textrm{Mall A} & \textrm{Mall B} & \textrm{Mall C}\\\hline \textrm{Will Participate} & \frac{(133)(92)}{276} = 44.3 & \frac{(133)(92)}{276} = 44.3 & \frac{(133)(92)}{276} = 44.3\\ \textrm{Will Not Participate} & \frac{(143)(92)}{276} = 47.7 & \frac{(143)(92)}{276} = 47.7 & \frac{(143)(92)}{276} = 47.7\\ \end{array}$$

Assumptions are met (all $E \ge 5$)
Test statistic:
$$\begin{array}{rcl}
\chi^2 & = & \displaystyle{\frac{(52-44.3)^2}{44.3} + \frac{(45-44.3)^2}{44.3} + \frac{(36-44.3)^2}{44.3} + \cdots}\\\\
& & \displaystyle{\frac{(40-47.7)^2}{47.7} + \frac{(47-47.7)^2}{47.7} + \frac{(56-47.7)^2}{47.7}}\\\\
& \doteq & 5.6016
\end{array}$$
Critical value:
$\displaystyle{\left\{ \begin{array}{c} \textrm{degrees of freedom } = (r-1)(c-1) = (2-1)(3-1) = 2\\ \alpha=0.01 \end{array} \right\} \rightarrow 9.210}$

Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.
Inference: There is no significant evidence that the proportions who participate are not the same in all three locations.
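The marginal-total calculation used above generalizes to any table: the matrix of expected counts is the outer product of the row totals and column totals divided by the grand total. A minimal R sketch, with illustrative object names:

```r
# Observed counts (rows: response, columns: mall)
malls <- matrix(c(52, 45, 36,
                  40, 47, 56),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("Will Participate", "Will Not Participate"),
                                c("Mall A", "Mall B", "Mall C")))

# Expected counts: (row total)(column total) / grand total, for every cell
expected <- outer(rowSums(malls), colSums(malls)) / sum(malls)

stat <- sum((malls - expected)^2 / expected)
df   <- (nrow(malls) - 1) * (ncol(malls) - 1)
crit <- qchisq(0.99, df)

print(round(expected, 1))
print(c(statistic = stat, df = df, critical = crit))
```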
A researcher wishes to see if the proportions of workers for each type of job have changed during the last 10 years. A sample of 40 workers is selected, and the results are shown. At a 0.05 significance level, test the claim that the proportions have not changed.
$$\begin{array}{lcccc} & \textrm{Services} & \textrm{Manufacturing} & \textrm{Government} & \textrm{Other}\\\hline \textrm{10 years ago} & 56\% & 21\% & 18\% & 5\%\\ \textrm{Now} & 18 & 12 & 8 & 2\\ \end{array}$$

$H_0$: the proportions have not changed
Assumptions are not met. The expected count in the "Other" category is $2 \not\ge5$.
One should not proceed with a chi-square goodness of fit test.
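The assumption check that stops this test can be written out in a few lines of R. This sketch only computes the expected counts from the percentages of 10 years ago; the object names are illustrative.

```r
n  <- 40
p0 <- c(Services = 0.56, Manufacturing = 0.21, Government = 0.18, Other = 0.05)

expected <- n * p0                      # expected counts under H0
assumptions_met <- all(expected >= 5)   # the test requires every E >= 5

print(expected)
print(assumptions_met)
```

Since the "Other" category falls below 5, the flag is `FALSE`, in line with the decision above not to proceed.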
Test the claim that births are uniformly distributed among the months (i.e., one twelfth of the number of births occur on average in any one month), using the following data collected over the course of one year.
$$\begin{array}{lr|lr} \textrm{Jan} & 34 & \textrm{Jul} & 36\\ \textrm{Feb} & 31 & \textrm{Aug} & 38\\ \textrm{Mar} & 35 & \textrm{Sep} & 37\\ \textrm{Apr} & 32 & \textrm{Oct} & 36\\ \textrm{May} & 35 & \textrm{Nov} & 35\\ \textrm{Jun} & 35 & \textrm{Dec} & 35\\ \end{array}$$

$H_0$: births are uniformly distributed among the months
If the $419$ births were uniformly distributed, we would expect $419/12 \approx 34.917$ births in each month.
Assumptions met: $34.917 \ge 5$.
Test statistic: $\chi^2 = 1.1718$
Critical value: 19.675
Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.
Inference: There is no significant evidence that the births are not uniformly distributed among the months.
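A short R sketch of this goodness-of-fit calculation follows. It assumes $\alpha=0.05$, which is the level consistent with the critical value 19.675 used above; the vector name `births` is illustrative, and `month.abb` is R's built-in vector of month abbreviations.

```r
# Observed births per month, January through December
births <- setNames(c(34, 31, 35, 32, 35, 35, 36, 38, 37, 36, 35, 35), month.abb)

expected <- sum(births) / 12            # uniform expectation per month

stat <- sum((births - expected)^2 / expected)
crit <- qchisq(0.95, df = length(births) - 1)   # alpha = 0.05 is an assumption

print(c(total = sum(births), expected = expected,
        statistic = stat, critical = crit))
```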
Based on the following data from the doomed voyage of the Titanic, decide if the chance that a randomly selected passenger survived was independent of their status.
$$\begin{array}{l|cccc|c} & \textrm{Crew} & \textrm{1st Class} & \textrm{2nd Class} & \textrm{3rd Class} & \textrm{Total} \\\hline \textrm{Lived} & 212 & 202 & 118 & 178 & 710\\ \textrm{Died} & 673 & 123 & 167 & 528 & 1491\\\hline \textrm{Total} & 885 & 325 & 285 & 706 & 2201\\ \end{array}$$

$H_0$: The chance that a randomly selected passenger survived was independent of their status
Assumptions met as calculated expectations below are all $\ge 5$:
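$$\begin{array}{l|cccc} & \textrm{Crew} & \textrm{1st Class} & \textrm{2nd Class} & \textrm{3rd Class}\\\hline \textrm{Lived} & 285.4839 & 104.8387 & 91.93548 & 227.7419\\ \textrm{Died} & 599.5161 & 220.1613 & 193.06452 & 478.2581\\ \end{array}$$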
Test statistic: $\chi^2 = 187.79$
Critical Value: with $(4-1)(2-1) = 3$ degrees of freedom and $\alpha = 0.05$ (default), the critical value is 7.815.
Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.
Inference: There is significant evidence that a passenger's survival is related to their status.
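The expected counts and test statistic for the Titanic table can be checked with a sketch like the one below. The matrix name `titanic` is illustrative, and $\alpha=0.05$ is the default level assumed in the solution.

```r
# Observed counts (rows: outcome, columns: status on board)
titanic <- matrix(c(212, 202, 118, 178,
                    673, 123, 167, 528),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("Lived", "Died"),
                                  c("Crew", "1st Class", "2nd Class", "3rd Class")))

test <- chisq.test(titanic)

print(round(test$expected, 4))     # every entry should be at least 5
stat <- unname(test$statistic)
df   <- unname(test$parameter)
crit <- qchisq(0.95, df)
print(c(statistic = stat, df = df, critical = crit))
```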
Decide if the proportions of Democrats, Republicans, and Independents are the same for both men and women, based on the following sample data. $$\begin{array}{l|ccc} & \textrm{Democrat} & \textrm{Republican} & \textrm{Independent}\\\hline \textrm{Male} & 36 & 45 & 24\\ \textrm{Female} & 48 & 33 & 16\\ \end{array}$$
$H_0$: The proportions of democrats, republicans, and independents are the same for both men and women
Assumptions met as calculated expectations below are all $\ge 5$:
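$$\begin{array}{l|ccc} & \textrm{Democrat} & \textrm{Republican} & \textrm{Independent}\\\hline \textrm{Male} & 43.66337 & 40.54455 & 20.79208\\ \textrm{Female} & 40.33663 & 37.45545 & 19.20792\\ \end{array}$$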
Test statistic: $\chi^2 = 4.8512$
Critical Value: $5.991$ (at default $\alpha = 0.05$)
Conclusion: Fail to reject the null hypothesis, as the test statistic is not in the rejection region.
Inference: There is no significant evidence that the proportions of Democrats, Republicans, and Independents are different for men and women.
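This test can also be sketched in R. Besides the critical value method used above, the snippet reads off the p-value, which leads to the same decision; the matrix name `party` is illustrative.

```r
# Observed counts (rows: gender, columns: party affiliation)
party <- matrix(c(36, 45, 24,
                  48, 33, 16),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("Male", "Female"),
                                c("Democrat", "Republican", "Independent")))

test <- chisq.test(party)

stat  <- unname(test$statistic)
crit  <- qchisq(0.95, df = unname(test$parameter))
p_val <- test$p.value              # compare with alpha = 0.05

print(round(test$expected, 5))
print(c(statistic = stat, critical = crit, p_value = p_val))
```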
It is a common belief that more fatal car crashes occur on certain days of the week, such as Friday or Saturday. A sample of motor vehicle deaths is randomly selected for a recent year. The number of fatalities for the different days of the week are listed below. At the $0.05$ significance level, test the claim that accidents occur with equal frequency on the different days. State the null hypothesis, test statistic, critical value, your conclusion and interpretation. $$\begin{array}{l|c|c|c|c|c|c|c|} \textrm{Day} & \textrm{Sun} & \textrm{Mon} & \textrm{Tue} & \textrm{Wed} & \textrm{Thu} & \textrm{Fri} & \textrm{Sat}\\\hline \textrm{Number of Fatalities} & 31 & 20 & 20 & 22 & 22 & 29 & 26\\\hline \end{array}$$
In a study of drug abuse in a local high school, the school board selected 100 eighth graders, 100 sophomores and 100 seniors randomly from their respective rolls for each grade. Each student was then asked if they used a particular drug frequently, seldom or never. The data are summarized in the table given below. Is there evidence to suggest that the frequency of drug use is the same across the three different grades? State the null hypothesis, give the test statistic, test criterion, conclusion, and interpretation.
Frequency of Drug Use
In an experiment on extrasensory perception, subjects were asked to identify the month showing on a calendar in the next room. If the results were as shown, test the claim that months were selected with equal frequencies. Assume a significance level of $0.05$. If it appears that the months were not selected with equal frequencies, is the claim that the subjects have extrasensory perception supported? $$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|} \textrm{Jan} & \textrm{Feb} & \textrm{Mar} & \textrm{Apr} & \textrm{May} & \textrm{Jun} & \textrm{Jul} & \textrm{Aug} & \textrm{Sep} & \textrm{Oct} & \textrm{Nov} & \textrm{Dec}\\\hline 23 & 21 & 35 & 31 & 22 & 41 & 12 & 14 & 10 & 26 & 30 & 24\\\hline \end{array}$$
You suspect that a die is unfair. You roll it 60 times and get the following results: $$\begin{array}{l|c|c|c|c|c|c|} \textrm{Number on die} & 1 & 2 & 3 & 4 & 5 & 6\\\hline \textrm{Observed frequency} & 10 & 12 & 14 & 8 & 12 & 4\\\hline \end{array}$$ Determine if the above distribution is significantly different from the expected distribution assuming that the die is fair.
Students at Oxford were asked to indicate their agreement with the following statement: "I find mathematics challenging but I am able to make a good grade." Is there a difference in the distributions of responses between males and females? Students responded as follows: $$\begin{array}{l|c|c|c|c|} & \textrm{agree} & \textrm{no opinion} & \textrm{disagree} & \textrm{total}\\\hline \textrm{males} & 75 & 10 & 85 & 170\\\hline \textrm{females} & 121 & 8 & 51 & 180\\\hline \end{array}$$ Give the null hypothesis, test statistic, critical value at an appropriate alpha level, conclusion, and interpretation.
Students were asked to respond to the following statement: "Participating in study groups is an effective way to study for some courses." Is there a significant difference in the responses of freshmen and sophomores? Show appropriate hypothesis testing responses. $$\begin{array}{l|c|c|c|} & \textrm{agree} & \textrm{no opinion} & \textrm{disagree}\\\hline \textrm{Freshmen} & 34 & 21 & 35\\\hline \textrm{Sophomore} & 54 & 12 & 29\\\hline \end{array}$$
A pair of dice was rolled 500 times. The sums that occurred were as recorded in the following table. Test whether the dice seem fair based on these data. For example, $P(2,3,\textrm{ or } 4) = 1/6$ and the sums $2$, $3$, and $4$ occurred a total of $74$ times. Since the dice were rolled $500$ times, one would expect $83.3$ ($500 \times 1/6 \approx 83.3$) occurrences of rolling a $2$, $3$, or $4$, so $83.3$ is the expected value.

$$\begin{array}{l|c|c|c|c|c|} \textrm{Sum} & \{2,3,4\} & \{5,6\} & \{7\} & \{8,9\} & \{10,11,12\}\\\hline \textrm{Frequency (Observed)} & 74 & 120 & 83 & 135 & 88\\\hline \end{array}$$

Now rework this problem using the actual observed values for each sum:

$$\begin{array}{l|c|c|c|c|c|c|c|c|c|c|c|} \textrm{Sum} & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12\\\hline \textrm{Observed} & 12 & 26 & 36 & 58 & 62 & 83 & 102 & 33 & 20 & 9 & 59\\\hline \end{array}$$

Did you find that testing the dice this way gave a significant result? Which way would be best for determining whether the dice are fair?