Statistics Dictionary

Chi-Square Test for Homogeneity

The chi-square test of homogeneity is applied to a single categorical variable. It is used to compare the distribution of frequency counts across different populations. It answers the following question: Are frequency counts distributed identically across different populations?

The test procedure is appropriate when the following conditions are met:

  • For each population, the sampling method is simple random sampling.
  • Each population is at least 10 times as large as its respective sample.
  • The variable under study is categorical.
  • If sample data are displayed in a contingency table (Populations x Category levels), the expected frequency count for each cell of the table is at least 5.
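
The last condition can be checked directly from the contingency table. The sketch below is illustrative only: the function names and the counts are hypothetical, and it assumes expected counts are computed as (row total * column total) / n, with rows as populations and columns as category levels.

```python
def expected_counts(table):
    """Expected count per cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_expected_count_condition(table, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


# Hypothetical data: 2 populations (rows) x 3 category levels (columns).
observed = [[50, 30, 20],
            [50, 80, 70]]
# Hypothetical small table whose expected counts fall below 5.
too_small = [[3, 2],
             [4, 1]]

condition_ok = meets_expected_count_condition(observed)
condition_small = meets_expected_count_condition(too_small)
print(condition_ok, condition_small)
```

When the check fails, the chi-square approximation may be unreliable for that table.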

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

  • State the hypotheses. Every hypothesis test requires a null hypothesis and an alternative hypothesis. Suppose that data were sampled from r populations, and that the categorical variable has c levels. At any specified level of the categorical variable, the null hypothesis states that each population has the same proportion of observations. Thus,

    $H_0: P_{\text{level 1 of population 1}} = P_{\text{level 1 of population 2}} = \dots = P_{\text{level 1 of population } r}$
    $H_0: P_{\text{level 2 of population 1}} = P_{\text{level 2 of population 2}} = \dots = P_{\text{level 2 of population } r}$
    . . .
    $H_0: P_{\text{level } c \text{ of population 1}} = P_{\text{level } c \text{ of population 2}} = \dots = P_{\text{level } c \text{ of population } r}$

    The alternative hypothesis ($H_a$) is that at least one of the null hypothesis statements is false.

  • Formulate an analysis plan. The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the significance level and the test method (i.e., the chi-square test of homogeneity).

  • Analyze sample data. Using sample data from the contingency table, find the degrees of freedom, expected frequency counts, test statistic, and the P-value associated with the test statistic. The analysis described in this section is illustrated in the sample problem at the end of this lesson.

    • Degrees of freedom. The degrees of freedom (DF) is equal to:

      DF = (r - 1) * (c - 1)

      where r is the number of populations, and c is the number of levels for the categorical variable.

    • Expected frequency counts. The expected frequency counts are computed separately for each population at each level of the categorical variable, according to the following formula.

      $E_{r,c} = (n_r \cdot n_c) / n$

      where $E_{r,c}$ is the expected frequency count for population r at level c of the categorical variable, $n_r$ is the total number of observations from population r, $n_c$ is the total number of observations at level c of the categorical variable, and $n$ is the total sample size.

    • Test statistic. The test statistic is a chi-square random variable ($\chi^2$) defined by the following equation.

      $\chi^2 = \sum_{r,c} \left[ (O_{r,c} - E_{r,c})^2 / E_{r,c} \right]$

      where $O_{r,c}$ is the observed frequency count in population r for level c of the categorical variable, and $E_{r,c}$ is the expected frequency count in population r for level c of the categorical variable.

    • P-value. The P-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one computed from the sample. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

  • Interpret results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
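
The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the counts, the 0.05 significance level, and the names `chi2_sf` and `homogeneity_test` are all hypothetical. In place of the Chi-Square Distribution Calculator, `chi2_sf` computes the upper-tail probability for integer degrees of freedom from the standard recurrence for the regularized upper incomplete gamma function, using only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable
    with integer df >= 1 (stand-in for a distribution calculator)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def homogeneity_test(table, alpha=0.05):
    """Return (DF, chi-square statistic, P-value, reject H0?)."""
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    dof = (r - 1) * (c - 1)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = chi2_sf(chi2, dof)
    return dof, chi2, p_value, p_value < alpha


# Hypothetical data: 2 populations (rows) x 3 category levels (columns).
observed = [[50, 30, 20],
            [50, 80, 70]]
dof, chi2, p_value, reject = homogeneity_test(observed)
print(f"DF={dof}, chi-square={chi2:.3f}, P-value={p_value:.6f}, reject H0: {reject}")
```

For these made-up counts the P-value falls well below 0.05, so the null hypothesis of identical distributions would be rejected at that level.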
