p-Value: What It Is and What It Is Not
What Are p-Values?
p-values evaluate how well the sample data support the devil's advocate argument that the null hypothesis is true. They measure how compatible your data are with the null hypothesis: how likely is the effect observed in your sample data if the null hypothesis is true?
High p-values: your data are likely under a true null.
Low p-values: your data are unlikely under a true null.
A low p-value suggests that your sample provides enough evidence to reject the null hypothesis for the entire population.
How Do You Interpret p-Values?
In technical terms, a p-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis.
For example, suppose that a vaccine study produced a p-value of 0.04. This p-value indicates that if the vaccine had no effect, you'd obtain the observed difference or more in 4% of studies due to random sampling error.
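The "4% of studies" reading can be made concrete with a small simulation. The sketch below is only an illustration, not the analysis of any real vaccine study: the function name, the group size, the infection rate, and the observed difference are all made-up assumptions. It repeatedly simulates studies in which the vaccine truly has no effect and counts how often the difference between groups is at least as large as the one observed.

```python
import random


def simulated_p_value(observed_diff, n_per_group, base_rate, n_sims=5000, seed=0):
    """Estimate a two-sided p-value by simulating studies where the null is true.

    All arguments are hypothetical inputs chosen for illustration.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Null hypothesis: both groups share the same infection rate,
        # so any difference between them is random sampling error.
        control = sum(rng.random() < base_rate for _ in range(n_per_group))
        vaccine = sum(rng.random() < base_rate for _ in range(n_per_group))
        diff = abs(control - vaccine) / n_per_group
        if diff >= observed_diff:
            at_least_as_extreme += 1
    # Share of null-only studies with a difference at least as extreme.
    return at_least_as_extreme / n_sims


if __name__ == "__main__":
    # Hypothetical study: 100 people per group, 30% infection rate,
    # observed difference of 13 percentage points.
    print(simulated_p_value(observed_diff=0.13, n_per_group=100, base_rate=0.30))
```

The returned fraction plays the role of the p-value: it describes how often chance alone produces such a difference, and nothing more.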
p-values address only one question: how likely are your data, assuming a true null hypothesis? They do not measure support for the alternative hypothesis. This limitation leads us to a very common misinterpretation of p-values.
p-Values Are NOT the Probability of Making a Mistake
Incorrect interpretations of p-values are very common. The most common mistake is to interpret a p-value as the probability of making a mistake by rejecting a true null hypothesis (a Type I error).
There are several reasons why p-values can't be the error rate.
First, p-values are calculated based on the assumptions that the null is true for the population and that the difference in the sample is caused entirely by random chance. Consequently, p-values can't tell you the probability that the null is true or false, because the calculations treat the null as 100% true.
Second, while a low p-value indicates that your data are unlikely assuming a true null, it can't evaluate which of two competing cases is more likely:
a) The null is true but your sample was unusual.
b) The null is false.
Determining which case is more likely requires subject area knowledge and replication studies.
Let's go back to the vaccine study and compare the correct and incorrect ways to interpret the p-value of 0.04:
Correct: Assuming that the vaccine had no effect, you'd obtain the observed difference or more in 4% of studies due to random sampling error.
Incorrect: If you reject the null hypothesis, there's a 4% chance that you're making a mistake.
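One way to see why the incorrect reading fails is to simulate many studies in which the null is sometimes true and sometimes false, then check how many of the rejected nulls were actually true. The sketch below is a toy model under stated assumptions, not a description of real research: each study is a simple z-test with known standard deviation of 1, and the function name, the share of true nulls, the effect size, and the sample size are all hypothetical. It also uses a fixed 0.05 cutoff rather than one particular p-value.

```python
import random
from statistics import NormalDist


def false_rejection_share(prior_null, effect, n, alpha=0.05, n_studies=20000, seed=0):
    """Among studies that reject the null, return the share where the null was true.

    prior_null is the assumed fraction of studies whose null is really true;
    effect is the assumed true effect when the null is false.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    standard_error = 1 / n ** 0.5
    rejections = 0
    false_rejections = 0
    for _ in range(n_studies):
        null_is_true = rng.random() < prior_null
        true_effect = 0.0 if null_is_true else effect
        # Draw the sample mean directly from its sampling distribution.
        sample_mean = rng.gauss(true_effect, standard_error)
        z = sample_mean / standard_error
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p-value
        if p_value < alpha:
            rejections += 1
            false_rejections += null_is_true
    return false_rejections / rejections if rejections else float("nan")


if __name__ == "__main__":
    # Same cutoff, different shares of true nulls among the studies.
    for prior in (0.5, 0.9):
        print(prior, false_rejection_share(prior_null=prior, effect=0.5, n=30))
```

With the cutoff held fixed, the share of mistaken rejections changes as the proportion of true nulls changes. The p-value alone cannot supply that proportion, which is why it cannot be the probability of a mistake.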
Courtesy: Various web sources