Hypothesis testing

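Hypothesis testing is a statistical procedure for deciding whether observed data provide enough evidence to support a claim, or whether the result could plausibly be explained by random chance. It does not prove a fact; it quantifies how surprising the data would be if the claim were false. Example: Is drug A better than drug B?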

The alternative hypothesis $H_{a}$ is the claim you want to find evidence for (example: drug A is more effective than drug B).
The null hypothesis $H_{0}$ is the opposite, default claim (example: drug A is not more effective than drug B). Why do we need a null hypothesis? It’s useful when there is an established position (drug B is the most effective drug so far) and we play devil’s advocate (drug A is not more effective). We assume the null hypothesis is true, and if the observed data would be very unlikely under that assumption, we reject the null hypothesis in favor of the alternative: drug A is more effective than drug B. The logic resembles a proof by contradiction, except that the conclusion is probabilistic rather than certain.

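In machine learning, we can use hypothesis testing to check whether the improvement of a new model B over an established model A is statistically significant, rather than an artifact of the particular test set or random seed.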
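One way to make the model comparison concrete is a paired sign-flip test on per-fold scores. The sketch below is only an illustration: the function name `paired_sign_flip_p_value` and the accuracy values are hypothetical, and it assumes both models were evaluated on the same folds. Under the null hypothesis (no difference between the models), the sign of each per-fold difference is equally likely to be positive or negative, so we enumerate every sign assignment and count how often the total difference is at least as large as the observed one.

```python
from itertools import product

def paired_sign_flip_p_value(scores_a, scores_b):
    """One-sided exact sign-flip test for H1: model B scores higher than model A."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs)
    at_least_as_extreme = 0
    total = 0
    # Under H0 each difference could equally well have had the opposite sign.
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = sum(s * d for s, d in zip(signs, diffs))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total

# Hypothetical per-fold accuracies (illustrative numbers, not real results).
model_a = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.84, 0.80]
model_b = [0.83, 0.82, 0.84, 0.83, 0.85, 0.80, 0.85, 0.82]

p_value = paired_sign_flip_p_value(model_a, model_b)
# B wins on all 8 folds, so only 1 of the 2**8 sign assignments is as extreme.
print(f"p-value = {p_value:.4f}")  # 1/256, about 0.0039
```

With only a handful of folds the smallest achievable p-value is limited (here 1/256), so this kind of test is coarse; it is meant to show the reasoning, not to replace a carefully designed evaluation.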

P-value

The p-value measures the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true.
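As a small illustration of this definition, consider a coin that is assumed fair under the null hypothesis. The sketch below (the function name `binomial_p_value` and the observed counts are made up for the example) sums the probabilities of every outcome at least as extreme as the one observed.

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical observation: 9 heads in 10 flips; H0 says the coin is fair.
p_value = binomial_p_value(successes=9, trials=10)
print(f"p-value = {p_value:.4f}")  # (10 + 1) / 1024, about 0.0107
```

Here "at least as extreme" means 9 or 10 heads, which a fair coin produces in 11 of the 1024 equally likely sequences.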

Null Hypothesis (H0): This is a statement of no effect or no difference, which you are testing against.
Alternative Hypothesis (H1, also written $H_{a}$): This represents the opposite of the null, indicating that there is an effect or a difference.

Significance of the p-value

Decision-Making Tool:
- A low p-value (typically ≤ 0.05) suggests that the observed data is unlikely under the null hypothesis, leading researchers to reject the null hypothesis in favor of the alternative hypothesis.
- A high p-value (> 0.05) indicates that the data does not provide enough evidence to reject the null hypothesis.
Interpretation:
- p ≤ 0.05: Suggests statistical significance; the results are considered unlikely under the null hypothesis.
- p > 0.05: Not statistically significant; the results are consistent with chance variation, which is not the same as evidence that the null hypothesis is true.
Limitations:
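- The p-value does not measure the size or importance of an effect; it only indicates how compatible the data are with the null hypothesis. A tiny effect can yield a very small p-value if the sample is large enough.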
- Misinterpretation is common; a p-value does not indicate the probability that either hypothesis is true.

Example

Suppose you are testing a new drug’s effectiveness:

H0: The drug has no effect on patients.
H1: The drug has a positive effect on patients.

After conducting an experiment, you calculate a p-value of 0.03. Since this is less than 0.05, you would reject the null hypothesis at the 5% significance level: the observed improvement is statistically significant. This alone does not tell you how large the drug’s effect is.
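To show where such a p-value could come from, here is a minimal sketch of a one-sided two-proportion z-test using only the Python standard library. The recovery counts are hypothetical and were chosen so that the result lands near 0.03; the function name `two_proportion_z_test` is likewise just for illustration, and the normal approximation it relies on assumes reasonably large groups.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_drug, n_drug, success_placebo, n_placebo):
    """One-sided z-test of H0: p_drug == p_placebo against H1: p_drug > p_placebo."""
    p_drug = success_drug / n_drug
    p_placebo = success_placebo / n_placebo
    # Under H0 both groups share one recovery rate, estimated from the pooled data.
    pooled = (success_drug + success_placebo) / (n_drug + n_placebo)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_drug + 1 / n_placebo))
    z = (p_drug - p_placebo) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper tail only: H1 is "positive effect"
    return z, p_value

# Hypothetical trial: 60 of 100 patients recover on the drug, 47 of 100 on placebo.
z, p_value = two_proportion_z_test(60, 100, 47, 100)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The test is one-sided because H1 in this example claims a positive effect specifically; a two-sided alternative ("the drug has some effect") would double the tail probability.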

Methods for Hypothesis testing

There are many techniques to conduct hypothesis testing, like:

t-test: tests whether the means of two populations are equal (or whether one population mean equals a given value);
z-test: similar to the t-test, but for a large sample size or when the population standard deviation is known;
ANOVA (Analysis of Variance): tests whether the means of more than two populations are equal;
Chi-Square Test: tests the independence between two categorical variables.
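As an example of one of these methods, the following sketch runs a Pearson chi-square test of independence on a 2x2 table of counts. It is deliberately restricted to the 2x2 case (one degree of freedom), where the p-value has a closed form in terms of `math.erfc`; larger tables need a general chi-square distribution, which in practice usually comes from a statistics library. The function name `chi_square_2x2` and the counts are assumptions made for the illustration, and no continuity correction is applied.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). Assumes every expected count is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = group (A, B), columns = outcome (yes, no).
statistic, p_value = chi_square_2x2([[20, 10], [10, 20]])
print(f"chi2 = {statistic:.3f}, p-value = {p_value:.4f}")
```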

Type I and Type II Errors

Type I Error (False Positive)

Definition: Rejecting the null hypothesis when it is actually true.
Consequence: You think there’s an effect or difference when there isn’t.
Probability: Denoted by α (alpha), typically set at 0.05.

Example: A pregnancy test says you’re pregnant (positive), but you’re not.

Type II Error (False Negative)

Definition: Failing to reject the null hypothesis when it is actually false.
Consequence: You miss a real effect or difference.
Probability: Denoted by β (beta). Power = 1 - β.

Example: A pregnancy test says you’re not pregnant (negative), but you are.

| Error Type | Actual Truth | Decision Made | Description |
| --- | --- | --- | --- |
| Type I Error | H₀ is true | Reject H₀ | False Positive |
| Type II Error | H₀ is false | Fail to reject H₀ | False Negative |
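Both error rates can be estimated by simulation. The sketch below repeats a two-sided z-test of H0: mean = 0 on many simulated samples, once with H0 actually true and once with H0 false. The function name `rejection_rate`, the sample size, the effect size of 0.5, and the known standard deviation of 1 are all assumptions chosen for the illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mean, n=30, sigma=1.0, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated experiments in which a two-sided z-test rejects H0: mean = 0."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / sqrt(n))
        if abs(z) > z_critical:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate should be near alpha.
type_i_rate = rejection_rate(true_mean=0.0)
# H0 is false: every rejection is correct, so the rate estimates the power (1 - beta).
power = rejection_rate(true_mean=0.5)

print(f"Type I error rate: {type_i_rate:.3f} (alpha = 0.05)")
print(f"Power: {power:.3f}, Type II error rate: {1 - power:.3f}")
```

Lowering α reduces Type I errors but, for a fixed sample size, also lowers power and therefore raises β; increasing the sample size is the usual way to reduce both.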

statistics resources:

What is Hypothesis Testing ? Math, Statistics for data science, machine learning - YouTube
