Hypothesis testing is a statistical procedure for deciding whether data provide enough evidence for a claim, or whether the observed result could plausibly be explained by random variation alone. It does not prove a fact; it quantifies how surprising the data would be if there were no real effect. Example: Is drug A better than drug B?
- The alternative hypothesis is the claim you are looking for evidence of (example: drug A is more effective than drug B).
- The null hypothesis is the complement of the alternative hypothesis: the default position that there is no effect or no improvement (example: drug A is not more effective than drug B). Why do we need a null hypothesis? It is useful when there is an established position (drug B is the most effective drug so far) and we want to play devil’s advocate (drug A is not more effective). We assume the null hypothesis is true and ask how likely the observed data would be under that assumption. If the data would be very unlikely, we reject the null hypothesis and treat this as evidence that drug A is more effective than drug B.
In machine learning, we can use hypothesis testing to check whether a new model B outperforms an established model A by more than chance variation would explain.
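
As a rough illustration of the model comparison idea, the sketch below runs an exact paired sign-flip test on per-fold accuracies. The accuracy values, the function name `sign_flip_p_value`, and the choice of a sign-flip test are all assumptions made for this example; they are one possible approach, not a prescribed one. Cross-validation folds are also not fully independent, so the resulting p-value should be read as approximate.

```python
from itertools import product


def sign_flip_p_value(scores_a, scores_b):
    """Exact one-sided paired sign-flip test.

    H0: models A and B perform the same, so each per-fold difference
        is equally likely to be positive or negative.
    H1: model B scores higher than model A.
    """
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs)
    at_least_as_extreme = 0
    total = 0
    # Enumerate every way of flipping the signs of the differences.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = sum(s * d for s, d in zip(signs, diffs))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical per-fold accuracies for the two models (made-up numbers).
model_a = [0.81, 0.79, 0.84, 0.80, 0.78, 0.82, 0.80, 0.83]
model_b = [0.83, 0.82, 0.85, 0.83, 0.80, 0.85, 0.81, 0.86]

p_value = sign_flip_p_value(model_a, model_b)
print(f"p-value = {p_value:.4f}")
```

Because model B wins on every fold in this made-up data, only one of the 256 sign patterns is as extreme as the observed one, so the p-value is 1/256.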
P-value
The p-value measures the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true.
- Null Hypothesis (H0): This is a statement of no effect or no difference, which you are testing against.
- Alternative Hypothesis (H1): This represents the opposite of the null, indicating that there is an effect or a difference.
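
To make the definition concrete, here is a minimal sketch that computes an exact one-sided p-value for a binomial outcome. The scenario is hypothetical: 60 of 100 patients recover, and the null hypothesis says the recovery rate is 0.5. The function name `binomial_tail_p_value` is made up for this example.

```python
from math import comb


def binomial_tail_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null).

    This is the probability of a result at least as extreme as the
    observed one, assuming the null hypothesis is true.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical trial: 60 recoveries out of 100 patients, H0: rate = 0.5.
p_value = binomial_tail_p_value(60, 100)
print(f"p-value = {p_value:.4f}")  # about 0.028
```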
Significance of the p-value
- Decision-Making Tool:
  - A low p-value (typically ≤ 0.05) suggests that the observed data would be unlikely if the null hypothesis were true, leading researchers to reject the null hypothesis in favor of the alternative hypothesis.
  - A high p-value (> 0.05) indicates that the data does not provide enough evidence to reject the null hypothesis. This is not the same as evidence that the null hypothesis is true.
- Interpretation:
  - p ≤ 0.05: Conventionally called statistically significant; the results are considered unlikely under the null hypothesis.
  - p > 0.05: Not statistically significant; the results are compatible with chance variation under the null hypothesis.
- Limitations:
  - The p-value does not measure the size or importance of an effect; it only measures how compatible the data are with the null hypothesis.
  - Misinterpretation is common; a p-value is not the probability that either hypothesis is true.
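
The first limitation can be illustrated with a small sketch: the same tiny effect gives a large p-value with a small sample and a very small p-value with a huge sample. The numbers are invented, and the helper `two_sided_z_p_value` is a name chosen for this example; it assumes a known population standard deviation.

```python
from math import erfc, sqrt


def two_sided_z_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value of a one-sample z-test (known sigma)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


effect = 0.02  # the same small observed difference in both cases

p_small_sample = two_sided_z_p_value(effect, 0.0, 1.0, n=100)
p_large_sample = two_sided_z_p_value(effect, 0.0, 1.0, n=1_000_000)

print(f"n = 100:       p = {p_small_sample:.3f}")
print(f"n = 1,000,000: p = {p_large_sample:.3g}")
```

The effect size is identical in both calls; only the sample size changes, which is why a small p-value should not be read as a large or important effect.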
Example
Suppose you are testing a new drug’s effectiveness:
- H0: The drug has no effect on patients.
- H1: The drug has a positive effect on patients.
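After conducting an experiment, you calculate a p-value of 0.03. Since this is less than the 0.05 significance level, you reject the null hypothesis: the data provide statistically significant evidence that the drug has a positive effect. The p-value alone does not say how large that effect is.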
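
A drug example of this kind could be analysed in many ways. The sketch below uses a Monte Carlo permutation test, which is an assumption of this example rather than the method behind the 0.03 figure above. The improvement scores are made up, and `permutation_p_value` is a hypothetical helper name.

```python
import random
from statistics import fmean


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """One-sided permutation test for H1: treated mean > control mean."""
    rng = random.Random(seed)
    observed = fmean(treated) - fmean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_treated]) - fmean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical improvement scores (made-up data).
control = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0, 1.4, 0.6]
treated = [1.9, 1.4, 2.1, 1.6, 1.2, 2.0, 1.5, 1.8, 1.7, 1.3]

alpha = 0.05
p_value = permutation_p_value(treated, control)
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```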
Methods for Hypothesis testing
There are many techniques to conduct hypothesis testing, like:
- t-test: tests whether the means of two populations are equal;
- z-test: similar to the t-test, but for a large sample size or when the population standard deviation is known;
- ANOVA (Analysis of Variance): tests whether the means of more than two populations are equal;
- Chi-Square Test: tests the independence of two categorical variables.
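
Two of these tests can be sketched with the Python standard library alone. The example below shows a two-sample z-test with known standard deviations and a chi-square test of independence for a 2x2 table (without continuity correction). The function names and all numbers are assumptions for illustration. The t-test and ANOVA need the t and F distributions, which the standard library does not provide; in practice a library such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.f_oneway`) would typically be used for them.

```python
from math import erfc, sqrt


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2):
    """Two-sided z-test for equal means, population sigmas known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # P(|Z| >= |z|)
    return z, p_value


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical summary statistics for two groups.
z, p_z = two_sample_z_test(52.0, 50.0, 8.0, 8.0, 128, 128)
print(f"z = {z:.2f}, p = {p_z:.4f}")

# Hypothetical counts: rows = drug A / drug B, columns = recovered / not.
chi2, p_chi2 = chi_square_2x2([[30, 10], [20, 40]])
print(f"chi2 = {chi2:.2f}, p = {p_chi2:.2g}")
```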
Type I and Type II Errors
Type I Error (False Positive)
- Definition: Rejecting the null hypothesis when it is actually true.
- Consequence: You think there’s an effect or difference when there isn’t.
- Probability: Denoted by α (alpha), typically set at 0.05.
Example: A pregnancy test says you’re pregnant (positive), but you’re not.
Type II Error (False Negative)
- Definition: Failing to reject the null hypothesis when it is actually false.
- Consequence: You miss a real effect or difference.
- Probability: Denoted by β (beta). The power of a test, the probability of detecting an effect that really exists, is 1 - β.
Example: A pregnancy test says you’re not pregnant (negative), but you are.
| Error Type | Actual Truth | Decision Made | Description |
|---|---|---|---|
| Type I Error | H₀ is true | Reject H₀ | False Positive |
| Type II Error | H₀ is false | Fail to reject H₀ | False Negative |
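
Both error rates can be estimated by simulation. The sketch below repeatedly draws samples, applies a two-sided z-test at α = 0.05, and counts how often the null hypothesis is rejected when it is true (Type I) and how often it is not rejected when it is false (Type II). The sample size, true effect of 0.5, and helper names are assumptions picked for this example.

```python
import random
from statistics import NormalDist, fmean


def rejects_h0(sample, sigma, alpha):
    """Two-sided z-test of H0: mean = 0 with known sigma."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_critical


def rejection_rate(true_mean, n, sigma, alpha, trials, rng):
    """Fraction of simulated experiments in which H0 is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if rejects_h0(sample, sigma, alpha):
            rejections += 1
    return rejections / trials


rng = random.Random(0)
alpha = 0.05

# H0 is true (mean really is 0): every rejection is a Type I error.
type_i_rate = rejection_rate(0.0, 25, 1.0, alpha, 5000, rng)

# H0 is false (mean is 0.5): every non-rejection is a Type II error.
power = rejection_rate(0.5, 25, 1.0, alpha, 5000, rng)
type_ii_rate = 1 - power

print(f"Type I rate  (should be near alpha): {type_i_rate:.3f}")
print(f"Type II rate (beta):                 {type_ii_rate:.3f}")
print(f"Power (1 - beta):                    {power:.3f}")
```

The Type I rate stays close to α by construction, while β depends on the true effect size and the sample size: a larger sample or a larger effect lowers β and raises power.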
statistics resources: