The covariance between two variables X and Y is a measure that tells us how (but not how much) these variables vary together: it gives the direction of the trend. Covariance is closely related to correlation, which is just a normalized covariance.
If we have a two-dimensional dataset (each point has two coordinates), we can compute the variance along the x axis and along the y axis.
The two variances on their own don’t tell us much about the trend of the dataset, which is why we need the covariance between X and Y. We can compute it like this:

$$\operatorname{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big]$$
In other words, we take the expected value of the product of two deviations: X minus its mean, and Y minus its mean.
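As a minimal sketch of that definition, the snippet below computes the covariance by hand and compares it with NumPy. The function name `covariance` and the sample values are made up for illustration, and the code assumes the population form (dividing by N), which matches the expected-value description.

```python
import numpy as np

def covariance(x, y):
    """Population covariance: mean of (x - mean(x)) * (y - mean(y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = covariance(x, y)

# np.cov divides by N - 1 by default; bias=True divides by N,
# which matches the expected-value form used in covariance().
cov_np = np.cov(x, y, bias=True)[0, 1]

print(cov_xy, cov_np)
```

With the default `bias=False`, NumPy returns the sample covariance instead, which is slightly larger in magnitude for small datasets.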
A positive covariance means that we have an increasing trend (as X increases, Y increases), while a negative value means that we have a decreasing trend (as X increases, Y decreases).
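A small sketch of how the sign of the covariance follows the trend. The two datasets are hypothetical straight lines, one rising and one falling, used only to make the sign obvious.

```python
import numpy as np

# Hypothetical data: y_up rises with x, y_down falls with x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = 2.0 * x + 1.0
y_down = -3.0 * x + 10.0

# np.cov returns a 2x2 matrix; the off-diagonal entry is Cov(x, y).
cov_up = np.cov(x, y_up)[0, 1]
cov_down = np.cov(x, y_down)[0, 1]

print("increasing trend:", cov_up)
print("decreasing trend:", cov_down)
```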
Note
Covariance indicates the direction of the trend, but its magnitude alone doesn’t say how strong the trend is, because it depends on the scale of the variables.
To know how strong the trend is, we need a correlation coefficient, which is normalized (the covariance is divided by the product of the two standard deviations) and therefore doesn’t depend on the scale of the variables.
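The effect of normalization can be sketched as follows, assuming the Pearson correlation coefficient is the one meant here. The helper `correlation` and the data are illustrative; rescaling x by 100 stands in for a change of units.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

# Hypothetical sample data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Rescaling x (for example, a change of units) multiplies the covariance...
cov_original = np.cov(x, y, bias=True)[0, 1]
cov_rescaled = np.cov(100 * x, y, bias=True)[0, 1]

# ...but leaves the correlation unchanged.
corr_original = correlation(x, y)
corr_rescaled = correlation(100 * x, y)

print(cov_original, cov_rescaled)
print(corr_original, corr_rescaled)
```

The covariance grows by the same factor as the rescaling, while the correlation stays the same and always lies between -1 and 1.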
Note
Covariance has the commutative property, so $\operatorname{Cov}(X, Y) = \operatorname{Cov}(Y, X)$.
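The symmetry Cov(X, Y) = Cov(Y, X) can be checked numerically. This is a minimal sketch with made-up data; it relies on the fact that `np.cov` returns the full covariance matrix, whose two off-diagonal entries are the two orderings.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# 2x2 matrix: [[Var(x), Cov(x, y)], [Cov(y, x), Var(y)]]
cov_matrix = np.cov(x, y)
cov_xy = cov_matrix[0, 1]
cov_yx = cov_matrix[1, 0]

# The matrix equals its transpose, so the order of the arguments does not matter.
is_symmetric = np.allclose(cov_matrix, cov_matrix.T)

print(cov_xy, cov_yx, is_symmetric)
```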