It’s been said that statistics can be used to prove just about anything. Take, for example, one study I recently read about, which examined the link between vegetarianism among pregnant women and an increased risk of drug and alcohol abuse among their children. The study followed over 5,000 women and their children and found that children whose mothers ate little to no meat while pregnant were more likely to drink, smoke, and use drugs by age 15. It’s an interesting study, but it may also be an example of a phenomenon that’s tragically common in science, one often used to push an agenda at the cost of objectivity. I’m talking, of course, about “p-hacking”.
The “p” in p-hacking is the value used to determine statistical significance; a difference between two groups is generally only taken seriously if it’s statistically significant. Say a p value comes in below .001: that means that if there were no real effect at all, you’d see a result at least this extreme less than 0.1% of the time, so a single statistically significant result looks very unlikely to be a fluke. The trouble is that large datasets may involve countless variables, and every variable you test is another roll of the dice. Mine a dataset of 10,000 possible variables and you should expect roughly ten of them to clear that p < .001 bar by chance alone (10,000 × 0.001 = 10), so ten “statistically significant” results from that kind of fishing expedition may be nothing more than coincidence.
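To make that concrete, here’s a minimal sketch in Python. It isn’t anything from the study itself, and the group size is made up for illustration; it simply generates 10,000 purely random variables for two groups with no real difference between them, tests each one, and counts how many come out “significant” anyway.

```python
# A minimal sketch (numbers are illustrative, not from the study described above):
# mine 10,000 purely random variables and count how many cross p < 0.001.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
n_per_group = 500       # hypothetical group size, chosen only for illustration
n_variables = 10_000    # number of variables being "mined"

# Two groups with NO real difference between them: pure noise.
group_a = rng.normal(size=(n_per_group, n_variables))
group_b = rng.normal(size=(n_per_group, n_variables))

# Independent two-sample t-test on every variable at once.
_, p_values = stats.ttest_ind(group_a, group_b, axis=0)

false_positives = int(np.sum(p_values < 0.001))
print(f"'Significant' findings from pure noise: {false_positives}")
# Expect roughly 10,000 * 0.001 = 10 hits, even though nothing is real.
```

Run it a few times with different seeds and the count hovers around ten, which is exactly the coincidence problem: significance thresholds were designed for testing one hypothesis, not ten thousand.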
Ultimately, p-hacking is a constant possibility when analyzing large datasets, so treating every statistically significant result as real is dangerous. Luckily, there are safeguards that give greater insight into whether a result is real. The most basic is reproducibility: if other, independent datasets produce the same result, then you might just be onto something. Until that happens, though, there’s no reason to believe the result is real. Plenty of times, however, mining large datasets is used to find a convenient point and push an agenda; in the case of this study, perhaps the idea that pregnant women ought to be eating meat. Until the study is replicated, take it with a grain of salt.
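For what that replication check might look like in practice, here’s a short continuation of the sketch above (it reuses `np`, `stats`, `rng`, `n_per_group`, `n_variables`, and `p_values` from that block). It re-tests only the “significant” variables on fresh, independent data; hits that were pure noise almost never survive.

```python
# Continuation of the earlier sketch: a crude replication check.
# Re-test the "significant" variables on new, independent data.
significant = np.where(p_values < 0.001)[0]

fresh_a = rng.normal(size=(n_per_group, n_variables))
fresh_b = rng.normal(size=(n_per_group, n_variables))
_, fresh_p = stats.ttest_ind(fresh_a[:, significant],
                             fresh_b[:, significant], axis=0)

replicated = int(np.sum(fresh_p < 0.001))
print(f"Original hits: {len(significant)}, replicated on new data: {replicated}")
# Typically zero replicate, which is the whole point of demanding reproducibility.
```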