The miniaturization and mobility of computer vision systems are limited by the heavy computational burden and the size of optical lenses. Here, we propose to use a ultra-thin diffractive optical element to implement passive optical convolution. A division adjoint opto-electronic co-design method is also proposed. In our simulation experiments, the first few convolutional layers of the neural network can be replaced by optical convolution in a classification task on the CIFAR-10 dataset with no power consumption, while similar performance can be obtained.