In this paper, we present a novel approach to automatic 3D Facial Expression Recognition (FER) based on deep representation of facial 3D geometric and 2D photometric attributes. A 3D face is firstly represented by its geometric and photometric attributes, including the geometry map, normal maps, normalized curvature map and texture map. These maps are then fed into a pre-trained deep convolutional neural network to generate the deep representation. Then the facial expression prediction is simplyachieved by training linear SVMs over the deep representation for different maps and fusing these SVM scores. The visualizations show that the deep representation provides a complete and highly discriminative coding scheme for 3D faces. Comprehensive experiments on the BU-3DFE database demonstrate that the proposed deep representation can outperform the widely used hand-crafted descriptors (i.e., LBP, SIFT, HOG, Gabor) and the state-of-art approaches under the same experimental protocols.