Covariance

The covariance between two variables $x$ and $y$ is a quantity that tells us how (but not how strongly) these variables vary together: it gives the direction of the trend. Covariance is closely related to correlation, which is just the covariance normalized by the standard deviations of the two variables.

If we have a two-dimensional dataset (each point has two coordinates), we can compute the variance along the $x$ axis and along the $y$ axis.

Having the two separate variances doesn’t tell us much about the trend of the dataset, which is why we need the covariance between $x$ and $y$. We can compute it like this:

$$\operatorname{Cov}(X, Y) = E\left[(X - \mu_{x})(Y - \mu_{y})\right] \in (-\infty, +\infty)$$

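We are basically taking the expected value of the product of two deviations: $X$ minus its mean, and $Y$ minus its mean. The product is positive when both variables sit on the same side of their means and negative when they sit on opposite sides.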

A positive covariance means that we have an increasing trend (as X increases, Y increases), while a negative value means that we have a decreasing trend (as X increases, Y decreases).
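As a rough illustration of the formula and the sign rule, here is a minimal sketch in plain Python. The function name `covariance` and the sample data are made up for this example, and the expectation is approximated by a simple average over the points (the population form, dividing by $n$).

```python
def covariance(xs, ys):
    """Population covariance: the average of (x - mean_x) * (y - mean_y)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Average the product of the two deviations from the mean.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical data: one increasing trend, one decreasing trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_up = [2.0, 4.0, 6.0, 8.0, 10.0]
ys_down = [10.0, 8.0, 6.0, 4.0, 2.0]

cov_up = covariance(xs, ys_up)      # positive: Y increases with X
cov_down = covariance(xs, ys_down)  # negative: Y decreases as X increases

print(cov_up, cov_down)
```

Only the sign is easy to interpret here: the magnitude changes whenever the units of the data change.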

Note

Covariance quantifies the direction of the trend, but doesn’t say how strong the trend is.

To know how strong the trend is, we need to use a correlation coefficient, which does not depend on the scale of the variables because it is normalized.
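The effect of normalization can be sketched as follows. This example assumes the Pearson correlation coefficient (covariance divided by the product of the two standard deviations), which is one common choice and not necessarily the only one meant here. The helper names and the data are hypothetical.

```python
import math


def covariance(xs, ys):
    """Population covariance: the average of (x - mean_x) * (y - mean_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Pearson correlation: covariance normalized by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))  # Cov(X, X) is the variance of X
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical data; ys_scaled is the same variable in different units.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
ys_scaled = [100.0 * y for y in ys]

# Rescaling Y multiplies the covariance by 100 ...
print(covariance(xs, ys), covariance(xs, ys_scaled))
# ... but leaves the correlation unchanged.
print(correlation(xs, ys), correlation(xs, ys_scaled))
```

Rescaling one variable changes the covariance by the same factor, while the correlation stays within $[-1, 1]$ and keeps the same value.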

Note

Covariance is symmetric (commutative in its arguments), so $\operatorname{Cov}(x, y) = \operatorname{Cov}(y, x)$.

Covariance Matrix

statistics resources:

Covariance Clearly Explained! - YouTube
