The Real Meaning of "Due to Chance"
Sometimes people become so enamored with statistical methods that they hypostatize the terms used in such analysis, treating ideas like "chance" or "regression to the mean" as if they could be the actual causes of events in the real world.
The analysis of probability distributions arose largely in the context of dealing with errors in scientific measurements. Ten astronomers all measured the position of Mercury in the sky at a certain time on a particular evening, and got ten different results. What should we make of this mess?
It was a true breakthrough to analyze such errors as though they were results in a game of chance, and to realize that averaging all the measurements was a better way to approach the true figure than was asking "Which measurement was the most reliable?"
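A toy sketch makes the point concrete (my own illustration, with made-up numbers, not anything from the historical episode): simulate many rounds of ten noisy measurements of a quantity whose true value we pretend to know, and compare how far the average of the ten typically lands from the truth against how far any one reading typically lands.

```python
import random

# Hypothetical setup: the "true" position is 100.0, and each measurement is
# perturbed by a normally distributed error with standard deviation 0.5.
TRUE_VALUE = 100.0
random.seed(1)

avg_errors, single_errors = [], []
for _ in range(1000):
    measurements = [TRUE_VALUE + random.gauss(0, 0.5) for _ in range(10)]
    average = sum(measurements) / len(measurements)
    avg_errors.append(abs(average - TRUE_VALUE))
    single_errors.append(abs(measurements[0] - TRUE_VALUE))  # any one reading

print(f"typical error of a single reading: {sum(single_errors) / len(single_errors):.3f}")
print(f"typical error of the average:      {sum(avg_errors) / len(avg_errors):.3f}")
```

Run it and the average comes out several times closer to the truth, on average, than any individual astronomer's reading, without our ever having to decide whose measurement was "the most reliable."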
This breakthrough involved regarding the measurement errors in a population of measurements as being randomly distributed around the true value that a perfect measurement would have reported. The errors were "due to chance." What is more, we could perform a statistical test to see which deviations from that true value were most likely not due to chance, and were perhaps the result of something like a deliberate attempt to fix the outcome of a test.
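Here is a rough sketch of that kind of test (again my own illustration, not the historical method; the readings and the 3-sigma cutoff are invented for the example): flag any measurement whose deviation from the rest of the group is too large to comfortably write off as chance.

```python
import statistics

# Hypothetical readings: nine cluster near 100, one sits suspiciously high.
readings = [100.1, 99.9, 100.2, 99.8, 100.0, 100.1, 99.9, 100.2, 100.0, 104.7]

for i, r in enumerate(readings):
    # Compare each reading against the mean and spread of all the others.
    others = readings[:i] + readings[i + 1:]
    mean = statistics.mean(others)
    sd = statistics.stdev(others)
    z = (r - mean) / sd
    verdict = "look for a dominant cause" if abs(z) > 3 else "consistent with 'chance'"
    print(f"{r:6.1f}  z = {z:6.2f}  -> {verdict}")
```

The flagged reading is not thereby explained; the test only tells us that the swarm of small, undetected causes is an implausible account of it, and that some single dominant cause is worth hunting for.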
The phrase "due to chance" is just fine in the context of this statistical analysis: it means something like "We don't detect any causal factor so dominant in what we are analyzing that we should single it out as the cause of what occurred." But what it does not mean is that a causal agent called "chance" produced the result! No, it means that a large number of causal factors were at work, and that there is no way our test can isolate one in particular as "causing" the outcome.
In the context of measurement error, the fact that Johnson's measurement differed from Smith's, and from Davidson's, was caused by Smith's shaky hands, and Johnson having a smudge on his glasses, and the wind being high at the place Davidson was working, and Smith having slightly mis-calibrated his measuring device, and Johnson being distracted by a phone call, and Davidson misreading his device, and... so on and so on. So long as lots of causal factors influence each measurement, and none of them dominate the outcome of the measurement, we can treat their interplay as if some factor called "chance" were at play: but there is no such actual factor!
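A short simulation of that interplay (a sketch, with an arbitrary choice of how many small factors there are and how big each one is): every "error" below is just the sum of many small, independent nudges, none of them dominant. No factor called "chance" appears anywhere in the code, yet the errors pile up in the familiar bell shape around zero.

```python
import random

random.seed(0)

def one_measurement_error(n_factors=50):
    # Shaky hands, a smudged lens, a gust of wind, a slight miscalibration...
    # each factor nudges the result a little in either direction.
    return sum(random.uniform(-0.1, 0.1) for _ in range(n_factors))

errors = [one_measurement_error() for _ in range(10_000)]

# Crude text histogram of the resulting error distribution.
bins = [0] * 11
for e in errors:
    idx = min(10, max(0, int((e + 1.1) / 0.2)))
    bins[idx] += 1
for i, count in enumerate(bins):
    low = -1.1 + 0.2 * i
    print(f"{low:+.1f} to {low + 0.2:+.1f}: {'#' * (count // 100)}")
```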
Gene,
I work as a measurement engineer, and this is exactly how I deal with chance, randomness, what we call "uncertainty." Among other responsibilities, I create "MUAs," measurement uncertainty analyses, for our critical measurements.