The statistical-inference procedures discussed in this thread rely on a distribution called the chi-square distribution. A variable has a chi-square distribution if its distribution has the shape of a special type of right-skewed curve, called a chi-square curve. Actually, there are infinitely many chi-square distributions, and we identify the chi-square distribution in question by its number of degrees of freedom, just as we did for t-distributions.
Basic properties of chi-square curves
- The total area under a chi-square curve equals 1.
- A chi-square curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis.
- A chi-square curve is right skewed.
- As the number of degrees of freedom becomes larger, chi-square curves look increasingly like normal curves.
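The first property can be checked numerically. The sketch below, using only the standard library, defines the chi-square density f(x) = x^(k/2 - 1) e^(-x/2) / (2^(k/2) Γ(k/2)) and approximates the area under the curve with the trapezoid rule; the integration bounds and step count are arbitrary choices for this illustration.

```python
import math

def chi2_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def area(k, upper=100.0, steps=50_000):
    """Trapezoid-rule approximation of the total area under the curve."""
    h = upper / steps
    return sum(chi2_pdf(i * h, k) + chi2_pdf((i + 1) * h, k)
               for i in range(steps)) * h / 2

for k in (2, 5, 10):
    print(k, round(area(k), 3))  # each total area is approximately 1
```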
Chi-Square Goodness-of-Fit Test
Our first chi-square procedure is called the chi-square goodness-of-fit test. We can use this procedure to perform a hypothesis test about the distribution of a qualitative (categorical) variable or a discrete quantitative variable that has only finitely many possible values. Next, we describe the logic behind the chi-square goodness-of-fit test with an example.
The FBI compiles data on crimes and crime rates and publishes the information in Crime in the United States. A violent crime is classified by the FBI as murder, forcible rape, robbery, or aggravated assault. Table 13.1 gives a relative-frequency distribution for (reported) violent crimes in 2010. For instance, in 2010, 29.5% of violent crimes were robberies.
A simple random sample of 500 violent-crime reports from last year yielded the frequency distribution shown in Table 13.2. Suppose that we want to use the data in Tables 13.1 and 13.2 to decide whether last year’s distribution of violent crimes has changed from the 2010 distribution.
The idea behind the chi-square goodness-of-fit test is to compare the observed frequencies in the second column of Table 13.2 to the frequencies that would be expected – the expected frequencies – if last year’s violent-crime distribution is the same as the 2010 distribution. If the observed and expected frequencies match fairly well (i.e., each observed frequency is roughly equal to its corresponding expected frequency), we do not reject the null hypothesis; otherwise, we reject the null hypothesis.
To formulate a precise procedure for carrying out the hypothesis test, we need to answer two questions: 1) What frequencies should we expect from a random sample of 500 violent-crime reports from last year if last year’s violent-crime distribution is the same as the 2010 distribution? 2) How do we decide whether the observed and expected frequencies match fairly well?
The first question is easy to answer, which we illustrate with robberies. If last year’s violent-crime distribution is the same as the 2010 distribution, then, according to Table 13.1, 29.5% of last year’s violent crimes would have been robberies. Therefore, in a random sample of 500 violent-crime reports from last year, we would expect about 29.5% of the 500 to be robberies. In other words, we would expect the number of robberies to be 500 * 0.295, or 147.5.
In general, we compute each expected frequency, denoted E, by using the formula, E = np, where n is the sample size and p is the appropriate relative frequency from the second column of Table 13.1. Using this formula, we calculated the expected frequencies for all four types of violent crime. The results are displayed in the second column of Table 13.3.
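The formula E = np can be sketched in a few lines of Python. The relative frequencies below are hypothetical stand-ins for the second column of Table 13.1, except robbery's 0.295, which is given in the text; they are chosen only so that they sum to 1.

```python
n = 500  # sample size (number of violent-crime reports)

# Hypothetical 2010 relative-frequency distribution; only robbery's
# 0.295 comes from the text, the other values are illustrative.
relative_freqs = {
    "murder": 0.012,
    "forcible rape": 0.068,
    "robbery": 0.295,
    "aggravated assault": 0.625,
}

# Expected frequency for each category: E = n * p
expected = {crime: n * p for crime, p in relative_freqs.items()}
print(expected["robbery"])  # 147.5, matching the worked example
```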
The second column of Table 13.3 answers the first question. It gives the frequencies that we would expect if last year’s violent-crime distribution is the same as the 2010 distribution. The second question – whether the observed and expected frequencies match fairly well – is harder to answer. We need to calculate a number that measures the goodness of fit.
In Table 13.4, the second column repeats the observed frequencies from the second column of Table 13.2. The third column of Table 13.4 repeats the expected frequencies from the second column of Table 13.3. To measure the goodness of fit of the observed and expected frequencies, we look at the differences, O – E, shown in the fourth column of Table 13.4. Summing these differences to obtain a measure of goodness of fit isn’t very useful because the sum is 0. Instead, we square each difference (shown in the fifth column) and then divide it by the corresponding expected frequency. Doing so gives the values (O – E)^2 / E, called chi-square subtotals, shown in the sixth column. The sum of the chi-square subtotals, Σ(O – E)^2 / E = 6.529, is the statistic used to measure the goodness of fit of the observed and expected frequencies.
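The computation in Table 13.4 can be sketched as a short function. The observed counts below are hypothetical (the text's actual Table 13.2 counts are not reproduced here), so the resulting statistic differs from the 6.529 in the worked example; only the robbery expected count of 147.5 follows from the text.

```python
def chi_square_stat(observed, expected):
    """Sum of the chi-square subtotals (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [10, 30, 140, 320]          # hypothetical sample of 500 reports
expected = [6.0, 34.0, 147.5, 312.5]   # E = 500 * p for each category

print(round(chi_square_stat(observed, expected), 3))  # 3.699
```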
If the null hypothesis is true, the observed and expected frequencies should be roughly equal, resulting in a small value of the test statistic, Σ(O – E)^2 / E. As we have seen, that test statistic is 6.529. Can this value be reasonably attributed to sampling error, or is it large enough to suggest that the null hypothesis is false? To answer this question, we need to know the distribution of the test statistic Σ(O – E)^2 / E.
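One way to get intuition for this question, before the distribution of the statistic is introduced, is a small simulation: repeatedly draw samples of 500 from the null distribution and see how often sampling error alone produces a statistic as large as 6.529. The relative frequencies below are the same hypothetical stand-ins for Table 13.1 used earlier (only robbery's 0.295 is given in the text), so the estimate is illustrative only.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical null distribution; only 0.295 (robbery) comes from the text.
probs = [0.012, 0.068, 0.295, 0.625]
n = 500
expected = [n * p for p in probs]

def simulate_stat():
    """Draw one sample of n reports under H0 and compute the statistic."""
    counts = [0, 0, 0, 0]
    for cat in random.choices(range(4), weights=probs, k=n):
        counts[cat] += 1
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))

trials = 2000
extreme = sum(simulate_stat() >= 6.529 for _ in range(trials))
print(f"estimated P(statistic >= 6.529 under H0) ~ {extreme / trials:.3f}")
```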