Correlation analysis examines the relationship between two variables. The presence of a correlation should not be interpreted as evidence of causation: two variables can be highly correlated without being related in a cause-and-effect manner. For example, one could take the number of ice cream cones sold in a city and correlate it with the number of drowning deaths. If a direct correlation is found, does this mean that drowning deaths could be reduced by cutting down the number of ice cream cones sold? Of course not. Some other variable (hot weather, for instance) could be driving the increase in both.
When examining cause and effect, it is important to consider three issues:
Considering all of this, the question may arise as to why someone would bother with correlation analysis. The answer is that correlation analysis allows the researcher to make better-than-chance predictions. If we know that two events are correlated and we can determine the direction of the relationship, then we can predict, and possibly control, future events. For example, suppose we discover that hours of studying and test scores are related and, further, that test scores increase as hours of studying increase. Knowing this, we can increase our hours of studying to try to improve our grades.
There are three general forms of correlation:
How to Determine the Form of Correlation
The form of a correlation is determined by examining a correlation value. When the variables are metric, correlation is examined through the value of Pearson's r. Values of Pearson's r range from -1.00 to +1.00. A value of +1.00 indicates a perfect positive correlation. For example, if studying and test scores were correlated at +1.00, then increases in studying hours would always be accompanied by increases in test scores. It is extremely rare to find a perfect correlation. Conversely, a value of -1.00 indicates a perfect negative correlation: increases in studying hours would always be accompanied by decreases in test scores. A value of zero means that there is no linear correlation.
Strength of correlation increases as the absolute value of Pearson's r approaches 1.00, regardless of whether the sign is positive or negative. The strength of correlation is as follows:
If the correlation is positive, the relationship is said to be direct. In this case, as one variable increases the other variable will increase as well.
If the correlation is negative, the relationship is said to be inverse. In this case, as one variable increases the other variable will decrease.
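As a minimal sketch of how Pearson's r can be computed and its sign read as direct or inverse, the following uses only the Python standard library. The function name `pearson_r` and the study-hours data are hypothetical, chosen only to mirror the studying example; statistical packages provide their own equivalents.

```python
import math

def pearson_r(x, y):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and test scores for five students
hours = [1, 2, 3, 4, 5]
scores = [55, 60, 70, 72, 85]

r = pearson_r(hours, scores)
direction = "direct" if r > 0 else "inverse" if r < 0 else "none"
print(f"r = {r:.3f} ({direction})")
```

Because r is a ratio of the shared variation to the total variation in each variable, it always falls between -1.00 and +1.00, and its sign alone tells whether the relationship is direct or inverse.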
Once a value for Pearson's r is calculated, the next step is to determine its significance. This testing is done to determine whether the correlation found in the sample can be generalized to the population. When testing the significance of Pearson's r, we are testing the null hypothesis that there is no relationship between the two variables in the population and that any perceived relationship is due to chance or sampling error. The null hypothesis is written as follows:
The calculated value of Pearson's r is compared with a table of known critical values. If the absolute value of the calculated r is greater than the table value, then the null hypothesis is rejected. If it is less than the table value, then the null hypothesis is accepted (or, as some prefer to say, we fail to reject the null hypothesis).
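The comparison against a critical value can be sketched in code. This example assumes the standard relationship between r and the t distribution with n - 2 degrees of freedom, which is how critical-value tables for r are usually derived. The function names are hypothetical, and the critical t value must be looked up by the reader for the chosen significance level and degrees of freedom; 2.306 is the two-tailed value for alpha = .05 with 8 degrees of freedom.

```python
import math

def critical_r(t_crit, df):
    """Convert a two-tailed critical t value into the matching critical r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

def rejects_null(r, n, t_crit):
    """Return True if |r| exceeds the critical r for n pairs (df = n - 2)."""
    df = n - 2
    if df < 1:
        raise ValueError("at least 3 pairs of observations are required")
    return abs(r) > critical_r(t_crit, df)

# Assumed inputs: n = 10 pairs, so df = 8; two-tailed alpha = .05
n = 10
t_crit = 2.306

print(f"critical r = {critical_r(t_crit, n - 2):.3f}")
print("r = 0.62 :", "reject" if rejects_null(0.62, n, t_crit) else "fail to reject")
print("r = -0.70:", "reject" if rejects_null(-0.70, n, t_crit) else "fail to reject")
```

Note that the comparison uses the absolute value of r, so a strong negative correlation is judged by the same threshold as a strong positive one.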
There are five requirements for using Pearson's r:
If the measurements are not at least interval data and are instead ordinal data (or dichotomous dummy-coded variables), then the statistical technique of choice is Spearman's rho. Spearman's rho is a nonparametric statistical technique, meaning that it does not require normal distributions or data on an interval or ratio scale.
Spearman's rho is interpreted in much the same manner as Pearson's r, and the significance of the correlation is expressed in the same manner. The null hypothesis is that there is no relationship between the variables and that any perceived relationship is due to chance or sampling error. The hypotheses are written as follows:
A correlation value is calculated and then compared to a table of critical values. If the absolute value of the calculated value is greater than the table value, then the null hypothesis is rejected. If it is less than the table value, then the null hypothesis is accepted (that is, we fail to reject it).
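One common way to obtain Spearman's rho is to replace each observation with its rank (tied values share the average of their ranks) and then compute Pearson's r on the ranks. The sketch below takes that approach using only the Python standard library. The function names and the survey-style ratings are hypothetical and serve only as illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are zero-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Return Spearman's rho as Pearson's r computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ordinal data: two 1-to-5 ratings from six respondents
satisfaction = [1, 2, 2, 3, 4, 5]
loyalty = [2, 1, 3, 3, 5, 4]
print(f"rho = {spearman_rho(satisfaction, loyalty):.3f}")
```

Because only the ordering of the values matters, any monotonic relationship (for example, y equal to the square of a positive x) yields a rho of 1.00 even when the relationship is not linear.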
To calculate correlation between variables in SPSS:
Open a data file
Move the variables of interest into the "variables" box