Image demosaicing is an important step in the image processing pipeline for digital cameras, and it is one of the many tasks within the field of image restoration. A well-known characteristic of natural images is that most patches are smooth, while high-content patches like textures or repetitive patterns are much rarer, which results in a long-tailed distribution. This distribution can create an inductive bias when training machine learning algorithms for image restoration tasks and for image demosaicing in particular. There have been many different approaches to address this challenge, such as utilizing specific losses or designing special network architectures. What makes our work is unique in that it tackles the problem from a training protocol perspective. Our proposed training regime consists of two key steps. The first step is a data-mining stage where sub-categories are created and then refined through an elimination process to only retain the most helpful sub-categories. The second step is a cyclic training process where the neural network is trained on both the mined sub-categories and the original dataset. We have conducted various experiments to demonstrate the effectiveness of our training method for the image demosaicing task. Our results show that this method outperforms standard training across a range of architecture sizes and types, including CNNs and Transformers. Moreover, we are able to achieve state-of-the-art results with a significantly smaller neural network, compared to previous state-of-the-art methods.