W_ord2vec_ solves the problem of sparsity since it uses a Neural Network to produce a dense vector for each word in a document.
Furthermore, the vector that represents the word will keep some algebraic properties, for example .
The catch is that we have to aggregate all those vectors in order to represent a document. A basic idea is just to use the sum, the average or other aggregation operations.
A better idea is to multiply each word vector for its TF-IDF score, and then sum them all, in order to have a more well weighted final document vector.