Robots that handle products autonomously in everyday applications such as logistics or retail risk damaging the items they manipulate. Most existing approaches try to minimize physical interaction with the goods. We instead propose to take into account any unintended motion of objects in the scene and to learn, in a self-supervised way, manipulation strategies that minimize the potential damage. The presented approach consists of a planning method that determines the optimal sequence in which to manipulate a number of objects in a scene with respect to possible damage; it does so by simulating the interaction and thereby anticipating the scene dynamics. The planned manipulation sequences serve as input to a machine learning process that generalizes to new, unseen scenes in the same application scenario. The learned manipulation strategy is continuously refined in a self-supervised optimization cycle during load-free times of the system. Such a simulation-in-the-loop setup is commonly known as mental simulation and allows for efficient, fully automatic generation of training data, in contrast to classical supervised learning approaches. In parallel, the generated manipulation strategies can be deployed in near real time in an anytime fashion. We evaluate our approach on one industrial scenario (autonomous container unloading) and one retail scenario (autonomous shelf replenishment).
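The planning step described above, choosing a manipulation order that minimizes simulated damage, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `plan_sequence`, `toy_damage` and `SUPPORTS` are hypothetical, the damage function is a toy stand-in for a physics simulation, and the exhaustive search over all orderings is only the simplest possible baseline, not the planning method of this work.

```python
from itertools import permutations
from typing import Callable, FrozenSet, Sequence, Tuple

# Hypothetical stand-in for a physics simulation: given the object to remove
# and the objects still in the scene, return an estimated damage cost.
DamageFn = Callable[[str, FrozenSet[str]], float]


def plan_sequence(
    objects: Sequence[str], simulate_damage: DamageFn
) -> Tuple[Tuple[str, ...], float]:
    """Exhaustively search all removal orders and return the cheapest one."""
    best_order: Tuple[str, ...] = ()
    best_cost = float("inf")
    for order in permutations(objects):
        remaining = set(objects)
        cost = 0.0
        for obj in order:
            remaining.remove(obj)
            # Damage caused by removing obj while `remaining` is still present.
            cost += simulate_damage(obj, frozenset(remaining))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Toy scene (assumption): each key supports the objects listed in its value.
SUPPORTS = {"bottom": {"middle"}, "middle": {"top"}, "top": set()}


def toy_damage(obj: str, remaining: FrozenSet[str]) -> float:
    # One unit of damage for every object that still rests on obj and
    # would therefore fall when obj is taken away.
    return float(len(SUPPORTS[obj] & remaining))


order, cost = plan_sequence(list(SUPPORTS), toy_damage)
print(order, cost)
```

In this toy stack the only damage-free order removes the objects from top to bottom. Exhaustive search grows factorially with the number of objects, so it is only practical for very small scenes.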