The accuracy of solving partial differential equations (PDEs) on coarse grids is greatly affected by the choice of discretization schemes. In this work, we propose to learn time integration schemes based on neural networks which satisfy three distinct sets of mathematical constraints, i.e., unconstrained, semi-constrained with the root condition, and fully-constrained with both root and consistency conditions. We focus on the learning of 3-step linear multistep methods, which we subsequently applied to solve three model PDEs, i.e., the one-dimensional heat equation, the one-dimensional wave equation, and the one-dimensional Burgers' equation. The results show that the prediction error of the learned fully-constrained scheme is close to that of the Runge-Kutta method and Adams-Bashforth method. Compared to the traditional methods, the learned unconstrained and semi-constrained schemes significantly reduce the prediction error on coarse grids. On a grid that is 4 times coarser than the reference grid, the mean square error shows a reduction of up to an order of magnitude for some of the heat equation cases, and a substantial improvement in phase prediction for the wave equation. On a 32 times coarser grid, the mean square error for the Burgers' equation can be reduced by up to 35% to 40%.