Correlation Tests

Perhaps the most common kind of experiment in biology looks for a correlation, or relationship, between two variables. For example, is there a correlation between light intensity and transpiration rate, or between pollution and diversity?

The most common measure of correlation is the Pearson product-moment correlation coefficient, r. It varies from +1 (perfect positive correlation) through 0 (no correlation) to –1 (perfect negative correlation). In Excel, r is calculated using the formula =CORREL(X range, Y range). It is usual to draw a scatter graph of the data whenever a correlation is being investigated.
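For readers working outside Excel, the same coefficient can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package: the function name pearson_r and the light and transpiration figures are invented for the example and are not data from any experiment described here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements, made up purely for illustration
light = [10, 20, 30, 40, 50]
transpiration = [1.1, 1.9, 3.2, 3.8, 5.1]

r = pearson_r(light, transpiration)
print(round(r, 3))
```

A value close to +1 here would correspond to points lying close to a rising straight line on the scatter graph.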

For example, the sizes of breeding pairs of penguins were measured to see whether there was a correlation between the sizes of the two sexes. The scatter graph and r value of 0.88 indicate a strong positive correlation. In other words, large females tend to pair with large males. Of course this doesn't say why, but it shows there is a correlation to investigate further.

An alternative measure of correlation is Spearman's rank-order correlation, rs. It is very similar to the Pearson correlation, but it can be used in a much wider range of situations: it only requires that the data can be ranked, whereas the Pearson correlation is appropriate only when the data are continuous, roughly normally distributed and linearly related. To calculate rs, first make two new columns showing the ranks (or order) of the X and Y data (either by hand or using Excel's =RANK function), and then calculate the Pearson correlation on the rank data. Tied values should be given the average of the ranks they share; =RANK gives tied values the same rank instead, so in Excel 2010 and later =RANK.AVG is the better choice. rs tends to be more conservative than r (i.e. it is less likely to show a correlation), but with the penguin data it still indicates a positive (though smaller) correlation.
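The ranking procedure described above can be sketched in Python as follows. The helper names (average_ranks, pearson_r, spearman_rs) and the data are hypothetical, and the sketch assumes tied values should share the average of their ranks, which is the usual convention.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 0-based positions i..j, shifted to 1-based ranks
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rs(xs, ys):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

print(round(pearson_r(x, y), 3), round(spearman_rs(x, y), 3))
```

The ranks here run from smallest to largest, whereas Excel's =RANK ranks from largest to smallest by default. The direction makes no difference to rs as long as both columns are ranked the same way.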

Beware confusing correlation with causation:

Just because there is a correlation doesn't necessarily mean there is a causal relationship. For example, there is a correlation between the concentration of lead in the air and low IQ in children, but the correlation alone doesn't show that lead causes the low IQ: both could be due to some other factor.
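A small simulation can make this point concrete. In the sketch below every variable is invented: z stands for a hypothetical hidden factor, and x and y each depend on z but have no effect on each other. It uses random numbers and does not model the lead and IQ data.

```python
import random
from math import sqrt

def corr(xs, ys):
    """Pearson r for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# z is the hidden factor; x and y are each z plus their own independent noise
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = corr(x, y)

# By construction x = z + noise and y = z + noise, so subtracting z
# leaves only the two independent noise terms
x_noise = [xi - zi for xi, zi in zip(x, z)]
y_noise = [yi - zi for yi, zi in zip(y, z)]
r_noise = corr(x_noise, y_noise)

print(round(r_xy, 2), round(r_noise, 2))
```

Under these assumptions x and y are clearly correlated even though neither influences the other, and the correlation all but disappears once the shared factor is removed.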