Grasping and releasing objects induces oscillations in delivery drones operating in warehouses. To reduce these undesired oscillations, this paper treats the to-be-delivered object as an unknown external disturbance and presents an image-based disturbance observer (DOB) to estimate and reject it. Unlike existing DOB techniques, which can compensate for a disturbance only after the oscillations have occurred, the proposed approach incorporates image-based disturbance prediction into the control loop to further improve the performance of the DOB. The proposed image-based DOB consists of two parts. The first is deep-learning-based disturbance prediction: from an image of the to-be-delivered object, a sequential disturbance signal is predicted in advance by a pre-trained convolutional neural network (CNN) connected to a long short-term memory (LSTM) network. The second is a conventional DOB in the feedback loop with a feedforward correction, which uses the deep learning prediction to generate a learning signal. Numerical studies are performed to validate the proposed image-based DOB in terms of oscillation reduction for delivery drones while objects are being grasped and released.
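The interplay between a predicted disturbance sequence and a feedback DOB can be illustrated with a minimal sketch. It is not the paper's implementation: the one-dimensional altitude model, the PD gains, the first-order DOB filter, and all function and parameter names (`simulate`, `payload_disturbance`, `tau`, and so on) are assumptions made for illustration. In particular, the CNN-LSTM predictor is replaced by a hand-written sequence that is assumed to capture 90% of the true payload force.

```python
# Illustrative sketch only: 1-D altitude model, m * a = u + d.
# All models, gains, and names are assumptions, not the paper's method.

def payload_disturbance(n, dt, t_grasp=1.0, t_release=3.0, force=-0.3 * 9.81):
    """Step-like payload force: applied at grasp, removed at release."""
    return [force if t_grasp <= k * dt < t_release else 0.0 for k in range(n)]


def simulate(d_true, d_pred, dt=0.001, m=1.0, kp=20.0, kd=8.0, tau=0.05):
    """Return the altitude-error trajectory under PD + feedforward + DOB."""
    z, v, r_hat = 0.0, 0.0, 0.0
    traj = []
    for d, dp in zip(d_true, d_pred):
        u_c = -kp * z - kd * v        # PD feedback regulating z to 0
        u = u_c - dp - r_hat          # cancel prediction and DOB estimate
        a = (u + d) / m               # plant acceleration (assumed measurable)
        # DOB: first-order low-pass estimate of the residual (d - dp),
        # reconstructed from the measured acceleration and the applied input.
        r_meas = m * a - u - dp
        r_hat += dt / tau * (r_meas - r_hat)
        v += a * dt                   # semi-implicit Euler integration
        z += v * dt
        traj.append(z)
    return traj


n, dt = 5000, 0.001
d_true = payload_disturbance(n, dt)
# Stand-in for the CNN-LSTM output: assumed to capture 90% of the true force.
d_pred = [0.9 * d for d in d_true]

peak_dob = max(abs(z) for z in simulate(d_true, [0.0] * n))  # DOB only
peak_ff = max(abs(z) for z in simulate(d_true, d_pred))      # DOB + prediction
print(f"peak |z| with DOB only:         {peak_dob:.5f} m")
print(f"peak |z| with DOB + prediction: {peak_ff:.5f} m")
```

With a zero prediction the loop reduces to a conventional DOB, which reacts only after the payload force has begun to disturb the altitude. Supplying the predicted sequence leaves the observer with only the prediction error to estimate, so in this toy model the peak altitude deviation shrinks accordingly.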