Missing node attributes are a common problem in real-world graphs. Graph neural networks have proven powerful for graph representation learning, but they rely heavily on the completeness of graph information. Few of them account for incomplete node attributes, which can severely degrade performance in practice. In this paper, we propose an innovative node representation learning framework, Wasserstein graph diffusion (WGD), to mitigate the problem. Instead of imputing features, our method learns node representations directly from graphs with missing attributes. Specifically, we extend the message-passing scheme of general graph neural networks to a Wasserstein space derived from the decomposition of attribute matrices. We test WGD on node classification tasks under two settings: entire attribute vectors missing on some nodes, and partial attributes missing on all nodes. In addition, we find that WGD is well suited to recovering missing values, and we adapt it to matrix completion problems on graphs of users and items. Experimental results on both tasks demonstrate the superiority of our method.