HT05: Are associations/causal links handled correctly?
Identifying associations between variables is essential in research. Researchers often report an association using a statistic called a correlation. The more closely two variables are associated, the stronger the correlation, so a correlation tells us how far two things change in tandem. Causal links are one source of correlations: smoking tobacco cigarettes causes lung cancer, and so smokers have a higher risk of lung cancer than non-smokers.
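To make the idea of a correlation more concrete, here is a minimal sketch of one common version of the statistic, the Pearson correlation coefficient. It runs from -1 (the variables move in exactly opposite directions) through 0 (no linear association) to +1 (they move together exactly). The function name `pearson_r` and all of the numbers are made up for illustration; they do not come from any real study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables vary together...
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_variation / math.sqrt(var_x * var_y)


# Hypothetical data for six students (illustration only).
hours_studied = [1, 2, 3, 4, 5, 6]
hours_of_tv = [6, 5, 5, 3, 2, 1]
test_score = [52, 55, 61, 64, 70, 74]

# Strongly positive: scores rise as study hours rise.
print(round(pearson_r(hours_studied, test_score), 2))
# Strongly negative: scores fall as TV hours rise.
print(round(pearson_r(hours_of_tv, test_score), 2))
```

Note that the calculation only measures how closely the two lists move together. Nothing in it knows, or can tell you, why they do.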
A common mistake is to run this reasoning back to front and assume that a correlation implies a causal link. Causal links produce correlations, but not all correlations are produced by causal links. Here’s a classic example:
This graph shows a correlation between the global average temperature and the approximate number of people engaged in piracy. There appears to be a correlation, but we are right to be skeptical of anyone who proposes a causal link. I hope you’ll agree that encouraging piracy will not decrease global temperatures. You can find many more examples of spurious correlations here.
It’s not always this simple to detect a spurious correlation. For example, let’s say an experimental study compares two groups of patients with an illness: one group takes Drug X, the other takes a placebo. Patients taking Drug X recover at a higher rate, but that correlation alone does not show that Drug X is the right treatment for this illness. Other physiological factors may be influencing the outcome, or the sample of patients may be biased. Sample bias is quite common, because most study participants are volunteers who live near the institute carrying out the study. In some parts of the world, this immediately skews the sample towards certain ethnicities and levels of wealth.
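The Drug X scenario can be sketched as a small simulation. In this made-up example the drug does nothing at all, but patients who are healthier to begin with are more likely to end up in the Drug X group. Every probability here, and the function names `simulate` and `recovery_rate`, are assumptions chosen purely for illustration.

```python
import random


def simulate(n_patients=20000, seed=0):
    """Return a list of (healthy_at_start, takes_drug, recovered) tuples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    patients = []
    for _ in range(n_patients):
        healthy = rng.random() < 0.5
        # Assumed bias: healthier patients are more likely to be in the Drug X group.
        takes_drug = rng.random() < (0.8 if healthy else 0.2)
        # Drug X has NO effect here: recovery depends only on starting health.
        recovered = rng.random() < (0.7 if healthy else 0.3)
        patients.append((healthy, takes_drug, recovered))
    return patients


def recovery_rate(patients, takes_drug, healthy=None):
    """Share of patients who recovered, optionally within one health level."""
    group = [
        p for p in patients
        if p[1] == takes_drug and (healthy is None or p[0] == healthy)
    ]
    return sum(p[2] for p in group) / len(group)


patients = simulate()

# Across all patients, the Drug X group appears to recover far more often.
print("Drug X :", round(recovery_rate(patients, True), 2))
print("Placebo:", round(recovery_rate(patients, False), 2))

# Comparing like with like, the apparent benefit largely disappears.
for healthy in (True, False):
    label = "healthy at start" if healthy else "unwell at start"
    print(label,
          round(recovery_rate(patients, True, healthy), 2),
          round(recovery_rate(patients, False, healthy), 2))
```

The gap between the two groups is real in the data, yet it comes entirely from who ended up in each group. This is why it matters how a study recruited and allocated its participants.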
So how do we test a correlation? Here’s one way: if you suspect that a correlation has a particular cause, work out what else that cause should produce in different circumstances, then check for it. Say you are a teacher, and you set a test for your students. The students score extremely well, but you suspect that they are sharing their answers. To test your hypothesis, you could set each student an individual test. Your explanation for the first observation (the class scores well because students are sharing answers) predicts a second (students who can’t share answers won’t score as well). If the scores drop, your suspicion gains support; if they stay high, the students probably know the material. This is just one way, and there are many others.
So, when reading an article, ask yourself: have the researchers found a correlation? And if they have, have they proposed a causal link? And do they have any other evidence to back up that their correlation is produced by their cause?