Point clouds are unstructured and unordered in the embedded 3D space. In order to produce consistent responses under different permutation layouts, most existing methods aggregate local spatial points through maximum or summation operation. But such an aggregation essentially belongs to the isotropic filtering on all operated points therein, which tends to lose the information of geometric structures. In this paper, we propose a spatial transformer point convolution (STPC) method to achieve anisotropic convolution filtering on point clouds. To capture and represent implicit geometric structures, we specifically introduce spatial direction dictionary to learn those latent geometric components. To better encode unordered neighbor points, we design sparse deformer to transform them into the canonical ordered dictionary space by using direction dictionary learning. In the transformed space, the standard image-like convolution can be leveraged to generate anisotropic filtering, which is more robust to express those finer variances of local regions. Dictionary learning and encoding processes are encapsulated into a network module and jointly learnt in an end-to-end manner. Extensive experiments on several public datasets (including S3DIS, Semantic3D, SemanticKITTI) demonstrate the effectiveness of our proposed method in point clouds semantic segmentation task.