Docking is an important tool in computational drug discovery that aims to predict the binding pose of a ligand to a target protein through a combination of pose scoring and optimization. A scoring function that is differentiable with respect to atom positions can be used for both scoring and gradient-based optimization of poses for docking. Using a differentiable grid-based atomic representation as input, we demonstrate that a scoring function learned by training a convolutional neural network (CNN) to identify binding poses can also be applied to pose optimization. We also show that an iteratively-trained CNN that includes poses optimized by the first CNN in its training set performs even better at optimizing randomly initialized poses than either the first CNN scoring function or AutoDock Vina.