Unlike the t-test and ANOVA, which test hypotheses by comparing population means, the Chi-Square test examines the relationship between categorical variables rather than means. It does so by analyzing the frequencies observed in the different categories.
For example, it can test whether there is an association between gender (male/female) and preference for a product (like/dislike).
The hypotheses of the Chi-Square test vary depending on the type of test being performed:
- Goodness-of-fit test: tests if the observed distribution of a categorical variable matches an expected distribution (example: testing if a die is fair):
  - $H_0$: The observed frequencies match the expected frequencies.
- Test of independence: tests if there is an association between two observed categorical variables (example: checking if gender is related to voting preferences):
  - $H_0$: The two variables are independent.
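Each null hypothesis implies a set of expected frequencies. The sketch below shows one way to derive them in plain Python. The function names and the sample counts are hypothetical, chosen only for illustration: a fair die is assumed to have probability 1/6 per face, and under independence each cell of a contingency table is expected to hold (row total × column total) / grand total.

```python
def expected_goodness_of_fit(observed, probabilities):
    """Expected counts if the hypothesised category probabilities hold."""
    total = sum(observed)
    return [total * p for p in probabilities]


def expected_independence(table):
    """Expected counts for a contingency table if rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical data: counts per face from 60 rolls of a die.
die_rolls = [8, 12, 9, 11, 10, 10]
die_expected = expected_goodness_of_fit(die_rolls, [1 / 6] * 6)  # about 10 per face

# Hypothetical data: rows = gender, columns = like / dislike.
preferences = [[30, 20], [20, 30]]
pref_expected = expected_independence(preferences)  # 25.0 in every cell
```

In the second case the expected counts come entirely from the table's own margins, which is why the test of independence needs no externally specified distribution.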
The hypothesis is tested by computing a chi-square statistic ($\chi^2$) that compares observed and expected frequencies. A higher $\chi^2$ value indicates that the observed frequencies deviate more from the expected frequencies, suggesting a lack of independence (in other words, a significant association) or a lack of fit.
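The statistic is usually written as $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$, where $O_i$ and $E_i$ are the observed and expected counts in category $i$. As a minimal sketch, the Python below computes it by hand for both kinds of test. The data and names are hypothetical, and the decision step assumes a 5% significance level using the standard table critical values (11.070 for 5 degrees of freedom, 3.841 for 1). No continuity correction is applied.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over two flat, equally long lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Standard chi-square critical values at alpha = 0.05.
CRITICAL_DF5 = 11.070  # 6 categories - 1
CRITICAL_DF1 = 3.841   # (2 rows - 1) * (2 columns - 1)

# Goodness of fit: hypothetical counts from 60 die rolls, 10 expected per face if fair.
observed_rolls = [8, 12, 9, 11, 10, 10]
expected_rolls = [10] * 6
gof_stat = chi_square_statistic(observed_rolls, expected_rolls)
print("die:", gof_stat, "reject H0" if gof_stat > CRITICAL_DF5 else "fail to reject H0")

# Independence: hypothetical 2x2 table, rows = gender, columns = like / dislike.
table = [[30, 20], [20, 30]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)
flat_observed = [cell for row in table for cell in row]
# Expected cell count under independence: row total * column total / grand total.
flat_expected = [r * c / n for r in row_totals for c in col_totals]
ind_stat = chi_square_statistic(flat_observed, flat_expected)
print("table:", ind_stat, "reject H0" if ind_stat > CRITICAL_DF1 else "fail to reject H0")
```

With these made-up counts the die statistic is about 1.0, well below its critical value, so the fair-die hypothesis is not rejected. The table statistic is about 4.0, which exceeds 3.841, so independence is rejected at the 5% level. In practice a statistics library would typically report a p-value instead of relying on a fixed critical value.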