• Question: When do you think a strong correlation between two factors can be named as a cause for sure, e.g. smoking causes lung cancer, especially if there is no technology to prove it?

    Asked by anon-244324 to Ondrej, Jordan, Eleanor, Ed on 19 Mar 2020.
    • Photo: Edward Banks

      Edward Banks answered on 19 Mar 2020:


      This is a good question! There’s a large area of statistics devoted to this sort of idea: at what point can we finally accept that something is not happening due to chance? In particle physics we often use the idea of standard deviation (SD). Lots of probability distributions look like bell curves (easier to google what this looks like than for me to explain), and how spread out the curve is is set by the standard deviation. The more SDs away from the centre a result is, the less likely it is to have happened by chance. The Higgs boson, for example, was not announced as definitely found until the signal had reached the five SD level, which in terms of probability means about a 1 in 3.5 million chance of getting a result like that by coincidence!
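
      To make the five SD figure concrete, here is a minimal Python sketch of the conversion from a number of SDs to a chance probability. It assumes a perfect bell curve (normal distribution) and counts only one tail, which is the usual particle-physics convention. The function name `one_sided_p_value` is just an illustrative choice.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Chance that a normally distributed value lands at least
    n_sigma standard deviations above the centre purely by chance."""
    # Upper-tail probability of the standard normal distribution,
    # written with the complementary error function.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        p = one_sided_p_value(n)
        print(f"{n} SD: p = {p:.3g}, about 1 in {1 / p:,.0f}")
```

      Under these assumptions, five SDs comes out at roughly 1 in 3.5 million, while three SDs (often called "evidence" rather than "discovery") is only about 1 in 740.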

    • Photo: Eleanor Jones

      Eleanor Jones answered on 19 Mar 2020:


      As Ed says, statistics plays a huge role in science. In particle physics, for example, we can never be 100% sure about anything, so we have to perform statistical tests to get an idea of how certain we are, and there are particular limits we have to reach before we term something a ‘discovery’. I would think that the same goes for other areas where we can never be 100% conclusive about something; we then have to define a threshold above which we deem something to be strongly correlated, or very likely to happen.

    • Photo: Ondrej Kovanda

      Ondrej Kovanda answered on 19 Mar 2020:


      This is a very good question. To first order, the answer is never. Correlation does not imply causality, and there is nothing we can do about it. What we can do, on the other hand, is quantify how likely we are to be right or wrong. If we observe three lung cancer patients, two of whom are heavy smokers, can we conclude that the link is likely to be causal? If we keep watching and count 100 patients, 90 of them smokers, our estimate of the causality might be better, but it is still an estimate. At the end of the day, it’s up to the person doing the analysis to set a limit on the parameter describing the correlation (for instance the number of smokers divided by the number of all patients, or something more elaborate), and to pronounce the relation causal once the parameter exceeds this limit. Any such limit comes with a certain level of confidence, but let’s not over-complicate things. Long story short, correlation and causality are fundamentally different concepts, and claims of causality are, and always will be, subjective to an extent.
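
      The difference between 2 out of 3 and 90 out of 100 can be sketched numerically. The Python snippet below puts a 95% range on the fraction of smokers among patients using the Wilson score interval. That method, the 95% level (`z = 1.96`) and the function name `wilson_interval` are assumptions chosen for illustration, not something taken from the answer above. Note that the interval only describes how well the fraction is known; it says nothing by itself about whether smoking is the cause.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion.

    successes: e.g. number of smokers among the patients
    total:     number of patients observed
    z:         number of SDs for the confidence level (1.96 ~ 95%)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    denom = 1 + z**2 / total
    centre = (p_hat + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    for smokers, patients in ((2, 3), (90, 100)):
        low, high = wilson_interval(smokers, patients)
        print(f"{smokers}/{patients} smokers: fraction between {low:.2f} and {high:.2f}")
```

      With three patients the plausible range for the fraction of smokers is very wide (roughly 21% to 94%), while with 100 patients it narrows to roughly 83% to 94%. More data tightens the estimate, but where to draw the line is still a choice made by the analyst.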
