# p-values: A Legacy of “Scientific Racism”

## A deeper look at the untold history of p-values and their legacy

Raveena Jay

Disclaimer: I’ll be putting quotes around certain words, like “race,” “racial measurements,” etc., because these terms are not only loaded with historical discrimination but also un-scientific: a “race” of people is not a scientific category. Also, the reader should have some familiarity with p-values, hypothesis testing, and Bayes’ theorem.

It’s easy to think of science as objective. Science, after all, is about studying and recording what occurs in the natural world, right? The natural world is “beyond” the human realm, beyond our imagining and our minds, as science classes would teach.

But we should not lose sight of who is doing the science: sure, a microscope will help you examine smaller objects and organisms more closely, and yes, using an X-ray telescope can display certain characteristics of a galaxy that might not be obvious in visible light — but in the end, we humans are doing the observing. And we are doing the interpreting of the data.

Before we delve into the history of p-values, and possible alternatives to them (and those alternatives’ shortcomings as well), let’s briefly go over the definition.

In statistics, the definition is: the p-value of a hypothesis test is the probability of your test statistic taking the observed value, or a more extreme one, assuming your null hypothesis is true. In mathematical notation, this would be

p = P(X ≥ x | H₀)
The vertical line means “given H₀,” which means “assuming the null hypothesis is true.” The value x would be, say, a “z-score,” a “t-statistic,” or a “chi-squared statistic” — if those are familiar words from your statistics classes.
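To make the definition concrete, here is a minimal sketch (not from the original article) that computes a one-sided p-value for a z-score using only Python’s standard library — the survival function of the standard normal can be written with the complementary error function:

```python
import math

def p_value_from_z(z: float) -> float:
    """One-sided p-value: P(Z >= z | H0), where Z is standard normal under H0.

    Uses the identity P(Z >= z) = 0.5 * erfc(z / sqrt(2)) from the
    standard library's complementary error function.
    """
    return 0.5 * math.erfc(z / math.sqrt(2))

# A z-score of 1.96 corresponds to a one-sided p-value of about 0.025.
print(round(p_value_from_z(1.96), 3))
```

The function name and structure here are illustrative; in practice you would likely reach for `scipy.stats`, but the arithmetic is the same.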

One of the most common mistakes is assuming that the p-value tells you the probability of your null hypothesis given the data as evidence. This is wrong. The mathematical notation for this would be:

P(H₀ | X = x)
These two formulas are different — but it’s easy to confuse the two. Usually, after doing experiments, scientists want the second formula but often get stuck with the first. Later we’ll see how we could get from one to the other, and how that connects to alternative ways to measure statistical results of experiments.
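A quick numerical sketch of how different the two quantities can be, using Bayes’ theorem with assumed illustrative numbers (the prior, significance level, and power below are hypothetical, not from the article):

```python
def prob_null_given_significant(prior_h0: float, alpha: float, power: float) -> float:
    """P(H0 | significant result) via Bayes' theorem.

    Assumes P(significant | H0) = alpha and P(significant | H1) = power,
    with P(H1) = 1 - prior_h0.
    """
    p_significant = alpha * prior_h0 + power * (1 - prior_h0)
    return alpha * prior_h0 / p_significant

# Illustrative assumption: if 90% of hypotheses tested are truly null,
# then a "significant" result at alpha = 0.05 with 80% power still leaves
# roughly a 36% chance that the null hypothesis is true -- far from 5%.
print(round(prob_null_given_significant(0.9, 0.05, 0.8), 2))
```

The point of the sketch: P(data | H₀) being small (a low p-value) does not by itself make P(H₀ | data) small; the answer also depends on the prior and the test’s power.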

Wikipedia has a nice illustration of this common error: