5X5 Episode 46

Skepticism 101 - Correlation and Causation
S: This is the SGU 5X5 and this is the next installment in our ongoing series on Skepticism 101, discussing some of the basic concepts in skeptical thinking. This time we're talking about the fallacy of confusing correlation with causation.

B: This is also known as "post hoc ergo propter hoc"; the logical fallacy also goes by simply "post hoc" or "false cause" or "coincidental correlation".

R: Pro tip: it actually sounds much more impressive when you use the Latin.

J: I think it's clearer to actually use the phrase "correlation does not imply causation."

S: That's not accurate, because it can imply causation; it doesn't equal causation. Or you could say it doesn't necessarily imply causation, right. Sometimes correlation is due to causation, but not always. It's the assumption of causation that is the logical fallacy.

B: And it's also equally invalid to just dismiss the correlation and say, "oh it's just a correlation" when there might be something to that.

S: Right. Yeah. It's equally invalid to say that "correlation is never causation" because it can be. Bob, just to clarify, you said that it goes by the name post hoc ergo propter hoc; that's actually a subset of the correlation fallacy; it's not the same exact thing, 'cause with post hoc, that's A causes B because B follows A.

B: Yeah, it's more of a temporal relationship, right?

S: "After this, therefore because of this", but correlations are not necessarily temporal. Correlation could just be two variables varying together. For example, we could say that drunkenness is a variable that follows sunspot activity; these are two variables that... when drunkenness is high, sunspots are high; when sunspots are low, drunkenness is low. That's a correlation. That does not necessarily mean that sunspots cause drunkenness or that drunkenness causes sunspots, which is an even more absurd notion. In fact, when you have a correlation, you always have to decide which among many possible relationships is true. If A correlates with B, then it's possible that A causes B, that B causes A, that both A and B are caused by another variable C, or that the correlation is a statistical fluke and therefore not real. Of course, that's always the first thing you have to determine: is it a real correlation or just a coincidence or a fluke? If it's real, then you've got to decide among those possible causal relationships which one or ones are correct.
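
The "both A and B are caused by another variable C" case can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables `a`, `b`, and `c` are made-up random numbers and do not come from the episode, and the helper names `pearson` and `simulate_confounder` are hypothetical. Here A and B never influence each other, yet they correlate because both depend on C; once C's contribution is subtracted out, the correlation should roughly vanish.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounder(n=5000, seed=0):
    """Return (raw correlation of A and B, correlation after removing C)."""
    rng = random.Random(seed)
    # Hypothetical hidden variable C that drives both A and B.
    c = [rng.gauss(0, 1) for _ in range(n)]
    # A and B each equal C plus independent noise; neither causes the other.
    a = [ci + rng.gauss(0, 1) for ci in c]
    b = [ci + rng.gauss(0, 1) for ci in c]
    # "Controlling for C": subtract C's known contribution (coefficient 1
    # by construction in this toy model) and correlate what is left.
    a_rest = [ai - ci for ai, ci in zip(a, c)]
    b_rest = [bi - ci for bi, ci in zip(b, c)]
    return pearson(a, b), pearson(a_rest, b_rest)


raw_r, controlled_r = simulate_confounder()
print(f"A vs B, ignoring C:        r = {raw_r:.2f}")
print(f"A vs B, controlling for C: r = {controlled_r:.2f}")
```

In a real data set the influence of C is not known in advance and would have to be estimated, which is part of why deciding among the possible causal relationships takes further investigation.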

J: This is also a very common logical fallacy that is good for people to make themselves familiar with. People will use this in an argument all the time; they're not even aware that they're doing it.

S: It's very very common. It's true.

B: So Steve, you're right; so if you see an apparent correlation, you've got to go through those steps and determine which one: is A causing B or B causing A, et cetera. So once you've identified all these possibilities, then what you need to do is look at the plausibility (how plausible are these?) to help you decide, and look for independent evidence. And use those methods to determine what relationship is the most likely, and then there's your answer.

S: Yeah, and you can also... each one of those is a hypothesis, and then you make predictions based upon those hypotheses. If A causes B, then we would expect to observe this; then let's see if we observe that. Or if B causes A, we would expect a different observation or experimental result. So then you can test the possible causal relationships. And that way you can line up multiple correlations so that there's only one cause that they have in common. For example, that is the way in which we concluded that smoking causes certain kinds of lung cancer, because of multiple correlations that all hold up if that causal relationship is true. So, correlations can be a powerful form of evidence and they can lead us to a conclusion about cause, but it's not a direct one-to-one relationship. You have to do further investigation and consider all the possibilities before you come to any causal conclusions.

J: Another good real-world example would be people taking some remedy for a cold and saying that it relieves their cold symptoms faster, or they get over the cold faster. Or another example would be someone going to a chiropractor for acute back pain and they say that the chiropractic visit actually sped up their recovery.

E: Right. How do they know they wouldn't have gotten better all by themselves, or they did something else that they were maybe not aware of or subconsciously aware of, and that could have been the cause, but they'll tend to correlate the action they took with what made them better.

S: Right, and if you're not controlling for all those other variables, all those other causal relationships, then you won't know. Did you get better because the chiropractic manipulation caused you to get better? Did you get better because you were going to get better on your own? And in fact, you tend to seek treatment when your symptoms are at their worst, so they're almost guaranteed to get better afterwards. Or, did you do three or four things and the manipulation was only one of the things that you did that led to the recovery? Like, perhaps you were also taking it easy or taking a day off from work or getting some moist heat or whatever. Unless you're controlling for all those variables, you can't make a causal conclusion. And that is the key difference between a scientific experiment, where you're trying to control every variable except for the one that you're studying, versus an uncontrolled observation where there's so many variables that you just can't confidently derive any causal conclusion simply based on a correlation, because there are so many possible explanations for any apparent correlation in an uncontrolled setting.
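
The point about seeking treatment when symptoms are at their worst can be sketched as a simulation. Everything here is an assumption made for illustration, not data from the episode: the pain scale, the `baseline` and `threshold` values, and the function name `simulate_untreated_recovery` are all hypothetical. Each simulated person's pain fluctuates randomly around a fixed baseline, they only seek treatment on a bad day, and the "treatment" is modeled as having no effect at all.

```python
import random


def simulate_untreated_recovery(n_people=10000, baseline=5.0,
                                threshold=8.0, seed=1):
    """Average pain on the treatment day and the day after, with a
    treatment that does nothing (hypothetical toy model)."""
    rng = random.Random(seed)
    before, after = [], []
    for _ in range(n_people):
        # Pain today is the baseline plus a random day-to-day fluctuation.
        today = baseline + rng.gauss(0, 2)
        if today >= threshold:
            # The person seeks treatment only when pain is unusually bad.
            # Tomorrow's pain is drawn exactly as before: zero treatment effect.
            tomorrow = baseline + rng.gauss(0, 2)
            before.append(today)
            after.append(tomorrow)
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return mean_before, mean_after, len(before)


mean_before, mean_after, n_treated = simulate_untreated_recovery()
print(f"people who sought treatment: {n_treated}")
print(f"average pain on treatment day: {mean_before:.1f}")
print(f"average pain the next day:     {mean_after:.1f}")
```

In this toy model the treated group improves on average even though nothing was done for them, which is why an uncontrolled before-and-after observation cannot show that the treatment caused the improvement. A comparison group that had equally bad days but received no treatment would show the same drop.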