Chi Square

Occasionally, a researcher will collect data that are categorical in nature. For example, suppose we are interested in whether someone attends church, attends football games, or stays at home on Sundays. To enter the data into the computer, we can code the responses as follows: 1 = attends church, 2 = attends football games, and 3 = stays home. Suppose we want to see if there is any difference between men and women with regard to how they answer this question. What test can we use?

t test? No, because the data are not metric. The dependent variable (Sunday activity) is nominal and the independent variable (gender) is a dichotomous dummy-coded variable.

Pearson’s r? No, because Pearson’s r also requires metric variables, and we are interested in a difference in frequencies rather than a linear relationship between the variables.

The answer is Chi square analysis. Chi square is the most commonly encountered statistical test for analyzing nominal data. It is a nonparametric test, meaning that it does not rely upon the same assumptions (normality, metric variables, etc.) as many of the other tests we have discussed.

There are 4 requirements for using the Chi square statistical technique:

The samples must have been randomly selected – this means that the researcher has used a commonly accepted sampling method for choosing which subjects to include. Why is this important? Because the goal of research is to infer results back to the population, and the researcher cannot infer back to the population if the sample was not properly selected.

The data must be in nominal form – this means that the data are categorical in nature. This is not a problem, however, because nominal is the lowest level of measurement, and therefore data on any of the other scales can be converted into nominal data.

There must be independent cell entries – this means that each subject in the study must be counted in only one cell, that is, tested on only one condition. If this assumption is violated, the data must be restructured before Chi square can be conducted.

No expected frequency should be less than 5 – this rule is debated among statisticians, with some claiming that it is acceptable to have expected frequencies below 5, while others argue that this is acceptable only in larger tables (for example, a 4 x 5 table). It should be noted that SPSS will alert the user if any of the expected cell frequencies drops below 5.

How does Chi square work? Chi square analysis works by examining the frequency of scores that occur in the study. The expected frequencies are compared with the frequencies actually observed, in order to determine whether the observed frequencies differ significantly from those that would be expected due to chance.
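The comparison described above is usually summarized in a single number, the Pearson Chi square statistic. As a sketch in standard notation, where the symbols O_i and E_i are introduced here for illustration and stand for the observed and expected frequency in category i, and k is the number of categories:

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

The larger the gaps between observed and expected frequencies, relative to what was expected, the larger the statistic.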

The first form of Chi square analysis that we will discuss is the 1 x k Chi square. Here the researcher is interested in examining the frequency of scores for one group of data spread across k categories.

For example, suppose a researcher is interested in examining whether one radio station is more popular than the others among teenagers. A random sample of 100 teenagers is selected, and they are categorized on the basis of their radio station preference. The data are as follows:

Station A – 40

Station B – 30

Station C – 20

Station D – 10

The values above are the observed frequencies. The next step is to determine the expected frequencies. There are two approaches to determining them.

The frequency expected due to chance – here the researcher divides the total number of subjects by the number of groups in the study. In the above example there are 100 subjects and 4 groups. Therefore, the frequency expected for each group would be 100/4 = 25.

The frequency expected based upon prior research in the area of the study – for example, suppose prior studies have shown that 35% of all teenagers listen to station A, 25% to station B, 20% to station C, and 20% to station D. With 100 subjects, the frequencies expected for the above example would be 35, 25, 20, and 20.
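Both approaches to the expected frequencies can be sketched in a few lines of Python. This is only an illustration using the radio station figures from the text; the variable names are hypothetical.

```python
# Observed radio station preferences from the example (100 teenagers).
observed = {"A": 40, "B": 30, "C": 20, "D": 10}
total = sum(observed.values())

# Approach 1: equal frequencies expected due to chance
# (total subjects divided by number of groups).
expected_by_chance = {station: total / len(observed) for station in observed}

# Approach 2: frequencies implied by prior research (35%, 25%, 20%, 20%).
prior_percent = {"A": 35, "B": 25, "C": 20, "D": 20}
expected_from_prior = {s: total * pct / 100 for s, pct in prior_percent.items()}

print(expected_by_chance)
print(expected_from_prior)
```

Whichever approach is used, the expected frequencies should add up to the total number of subjects.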

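Once the observed and expected frequencies are determined, the researcher must state the hypotheses. The null hypothesis states that the observed frequencies do not differ from the expected frequencies; the alternative hypothesis states that they do differ.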

The calculated Chi square value is then compared to a table of known critical values, using the chosen significance level and the degrees of freedom (k - 1 for a 1 x k Chi square). If the calculated value is greater than the table value, then the researcher rejects the null hypothesis. However, if the calculated value is less than the table value, then the researcher fails to reject the null hypothesis.
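To make the decision rule concrete, here is a minimal Python sketch that computes the Chi square value for the radio station data under both sets of expected frequencies. The function name chi_square is hypothetical, and the .05 significance level is an assumption, since the text does not specify one; 7.815 is the tabled critical value for 3 degrees of freedom at that level.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [40, 30, 20, 10]      # stations A, B, C, D
critical_value = 7.815           # assumed alpha = .05, df = k - 1 = 3

scenarios = [
    ("equal by chance", [25, 25, 25, 25]),
    ("prior research", [35, 25, 20, 20]),
]

for label, expected in scenarios:
    stat = chi_square(observed, expected)
    decision = "reject" if stat > critical_value else "fail to reject"
    print(f"{label}: chi square = {stat:.3f}, {decision} the null hypothesis")
```

Under these assumptions the equal-frequency expectation gives a value of 20.000, which exceeds the critical value, while the prior-research expectation gives about 6.714, which does not. The same data can therefore lead to different conclusions depending on how the expected frequencies are chosen.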

To calculate a 1 x k Chi square in SPSS:

Open the data file

Analyze

Nonparametric Tests (in recent versions of SPSS, then Legacy Dialogs)

Chi-Square

Place the test variable into the test variable list

If the expected frequencies are equal for all categories, then merely leave the area next to "all categories equal" marked.

If there is previous research that guides our expected frequencies, then we enter the expected frequencies into the area labeled "values" and ensure that the area next to "values" is marked.

Click OK

The second form of Chi square is referred to as the r x k Chi square (r by k). This form of Chi square is used when a researcher wishes to examine more than one group and then compare these groups with respect to some observed frequency. Most of the calculation remains the same, with the exception of how the researcher determines the expected frequencies.

To determine the frequency expected for an r x k Chi square, the data are entered into a contingency table. For example, consider a study with 4 groups: Group A received Vitamin C and had influenza (frequency of 10), Group B received Vitamin C and did not have influenza (frequency of 20), Group C received placebo and had influenza (frequency of 15), and Group D received placebo and did not have influenza (frequency of 15). The total number of subjects in this study is 60. To determine the frequency expected for Group A, we take the sum of the column that A is in (had influenza: 10 + 15 = 25), multiply it by the sum of the row that A is in (Vitamin C: 10 + 20 = 30), and divide by the total number of subjects. Therefore, (25 x 30)/60 equals 12.50. The procedure is then repeated for each group.
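The row-total times column-total rule can be sketched in Python for the Vitamin C example. The function name expected_frequencies and the table layout (rows for treatment, columns for influenza status) are choices made here for illustration.

```python
# Observed frequencies from the example.
table = [
    [10, 20],  # Vitamin C: had influenza, did not have influenza
    [15, 15],  # placebo:   had influenza, did not have influenza
]

def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(table)
for row in expected:
    print(row)
```

The first cell reproduces the 12.50 worked out above for Group A; the remaining cells follow from the same rule.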

The hypotheses are written the same as for a 1 x k Chi square, and the rules for rejecting or failing to reject the null hypothesis remain the same as well, except that the degrees of freedom are (r - 1) x (k - 1).
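Putting the pieces together for the Vitamin C example, the following Python sketch computes the r x k Chi square value and applies the same decision rule. The .05 significance level is an assumption not stated in the text; 3.841 is the tabled critical value for 1 degree of freedom at that level.

```python
observed = [
    [10, 20],  # Vitamin C: had influenza, did not have influenza
    [15, 15],  # placebo:   had influenza, did not have influenza
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell of the table.
stat = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        stat += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841  # assumed alpha = .05, df = 1

decision = "reject" if stat > critical_value else "fail to reject"
print(f"chi square = {stat:.3f}, df = {df}, {decision} the null hypothesis")
```

With these figures the value is about 1.714, which is below the critical value, so under the assumed significance level the null hypothesis would not be rejected.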

To conduct an r x k Chi square in SPSS:

Open the data file

Analyze

Descriptive Statistics

Crosstabs

Move the first grouping variable into the row section

Move the second grouping variable into the column section

Click statistics

Place a check mark by Chi-square

Click continue

Click OK