P-value

From Free net encyclopedia

In statistical hypothesis testing, the p-value of an observed value t_observed of some random variable T used as a test statistic is the probability that, given that the null hypothesis is true, T will assume a value at least as unfavorable to the null hypothesis as the observed value t_observed. "More unfavorable to the null hypothesis" can in some cases mean greater than, in some cases less than, and in some cases further away from a specified center.

In simpler terms, a p-value is the probability of obtaining a finding at least as "impressive" as the one actually obtained, assuming that the null hypothesis is true, that is, assuming that the finding was the result of chance alone. The fact that p-values are based on this assumption is crucial to their correct interpretation.

Example

For example, suppose an experiment is performed to determine whether a coin is fair (50% chance of landing heads or tails) or unfairly biased toward heads (> 50% chance of landing heads). The null hypothesis is that the coin is fair, and that any deviation from the 50% rate can be ascribed to chance alone. Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips. The p-value of this result is the chance of a fair coin landing on heads at least 14 times out of 20 flips (larger values are included because they are even less favorable to the null hypothesis of a fair coin). The calculated p-value for this is approximately 0.058.

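Therefore, in practical terms, the smaller the p-value, the less compatible the observations are with the null hypothesis. A large p-value means only that results at least this extreme would not be unusual if chance alone were at work; it is not itself the probability that the observations are due to chance.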
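
The value of 0.058 quoted in the example can be checked by summing the binomial probabilities of 14, 15, ..., 20 heads for a fair coin. The following is a minimal sketch in Python; the function name upper_tail_p_value is chosen here purely for illustration.

```python
from math import comb

def upper_tail_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= heads) for X ~ Binomial(flips, p_null)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

p = upper_tail_p_value(14, 20)
print(round(p, 3))  # 0.058
```

The sum runs only over the upper tail because the alternative considered in the example is a bias toward heads; a test for bias in either direction would also count the outcomes with 6 or fewer heads.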

Interpretation

Generally, one rejects the null hypothesis if the p-value is smaller than the test level, often represented by the Greek letter <math>\alpha</math> (alpha). If the level is 0.05, then the probability that the p-value is less than 0.05, given that the null hypothesis is true, is 0.05, provided the test statistic has a continuous distribution.
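
The claim that, under a true null hypothesis, a continuous test statistic gives p-values below 0.05 with probability 0.05 can be illustrated by simulation. The sketch below is not taken from the article: it assumes a one-sided z-test on normally distributed data with known standard deviation 1, and the function name, sample size, trial count, and seed are all arbitrary illustrative choices.

```python
import random
from statistics import NormalDist

def simulate_rejection_rate(alpha=0.05, n_obs=10, n_trials=20000, seed=0):
    """Fraction of simulated experiments with p < alpha when the null is true.

    Each experiment draws n_obs values from N(0, 1), so the null hypothesis
    "mean = 0" holds, and applies a one-sided z-test with known sigma = 1.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(n_trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_obs)) / n_obs
        z = sample_mean * n_obs**0.5       # test statistic, N(0, 1) under the null
        p_value = 1.0 - std_normal.cdf(z)  # upper-tail p-value
        if p_value < alpha:
            rejections += 1
    return rejections / n_trials

rate = simulate_rejection_rate()
print(rate)  # close to 0.05, up to simulation noise
```

With a discrete test statistic, such as the number of heads in the coin example, only finitely many p-values are possible and the rejection probability under the null is generally below the nominal level.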

In the above example, the calculated p-value exceeds 0.05, and thus the null hypothesis (that the observed result of 14 heads out of 20 flips can be ascribed to chance alone) is not rejected. Such a finding is often stated as being "not statistically significant at the 5% level".

However, had a single extra head been obtained, the resulting p-value would have been about 0.02. In that case the null hypothesis (that the observed result of 15 heads out of 20 flips can be ascribed to chance alone) would be rejected. Such a finding would be described as being "statistically significant at the 5% level".
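
The contrast between 14 and 15 heads can be reproduced by locating the smallest number of heads whose one-sided p-value falls below the level. This is a minimal sketch for the 20-flip fair-coin example only; the helper name upper_tail is an illustrative choice.

```python
from math import comb

def upper_tail(k: int, n: int = 20) -> float:
    """P(X >= k) for the number of heads X in n flips of a fair coin."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2**n

alpha = 0.05
# Smallest number of heads whose one-sided p-value is below alpha.
critical = next(k for k in range(21) if upper_tail(k) < alpha)

print(critical)                  # 15
print(round(upper_tail(14), 3))  # 0.058, not rejected at the 5% level
print(round(upper_tail(15), 3))  # 0.021, rejected at the 5% level
```

Because the number of heads is discrete, a test that rejects at 15 or more heads has an actual false-rejection probability of about 0.021 rather than exactly 0.05.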

There is often an alternative hypothesis, but the construction of the test does not allow for 'supporting' a specific alternative.

Critics of p-values point out that the criterion used to decide "statistical significance" is based on the somewhat arbitrary choice of level (often set at 0.05).

Frequent misunderstandings

There are several common misunderstandings about p-values. All of the following numbered statements are false:

1. The p-value is the probability that the null hypothesis is true (supposedly justifying the "rule" of treating p-values close to zero as significant).
   In fact, frequentist statistics does not, and cannot, attach probabilities to hypotheses. Comparison of Bayesian and classical approaches shows that a p-value can be very close to zero while the posterior probability of the null is very close to unity. This is the Jeffreys-Lindley paradox.
2. The p-value is the probability that a finding is "merely a fluke" (again, supposedly justifying the "rule" of considering small p-values as "significant").
   As the calculation of a p-value is based on the assumption that a finding is the product of chance alone, it patently cannot simultaneously be used to gauge the probability of that assumption being true.
3. The p-value is the probability of falsely rejecting the null hypothesis.
   This error is a version of the so-called prosecutor's fallacy.
4. The p-value is the probability that a replicating experiment would not yield the same conclusion.
5. 1-(p-value) is the probability of the alternative hypothesis being true (see (1)).
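
That a p-value and the probability of the null hypothesis are different quantities can be seen in a small numerical sketch based on the coin example. A posterior probability requires extra inputs that a p-value does not use; the ones below (a single alternative with a 0.7 chance of heads, and a prior probability of 0.5 that the coin is fair) are assumptions made purely for illustration and do not come from the article.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n flips with heads-probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

heads, flips = 14, 20

# Frequentist p-value: P(at least 14 heads | coin is fair).
p_value = sum(binom_pmf(k, flips, 0.5) for k in range(heads, flips + 1))

# Bayesian posterior, under ASSUMED inputs not found in the article.
prior_fair = 0.5  # assumed prior probability that the coin is fair
p_biased = 0.7    # assumed heads-probability of the only alternative considered
like_fair = binom_pmf(heads, flips, 0.5)
like_biased = binom_pmf(heads, flips, p_biased)
posterior_fair = (prior_fair * like_fair) / (
    prior_fair * like_fair + (1 - prior_fair) * like_biased
)

print(round(p_value, 3))         # 0.058
print(round(posterior_fair, 2))  # roughly 0.16 under these assumptions
```

Changing the assumed prior or alternative changes the posterior while leaving the p-value untouched, which is one way to see that the p-value cannot be the probability that the null hypothesis is true.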