Using episodic memories in continual learning is an effective way to mitigate catastrophic forgetting. Recent studies have developed several gradient-based approaches that make more efficient use of compact episodic memories by constraining the gradients computed on new samples with gradients computed on memorized samples. In this paper, rather than directly re-projecting gradients, we propose to reduce the diversity of gradients through an auxiliary optimization objective that we call Discriminative Representation Loss. Our method shows promising performance at relatively low computational cost on several continual-learning benchmarks.
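To make the idea concrete, below is a minimal PyTorch sketch of one plausible instantiation of such an auxiliary representation loss, not necessarily the paper's exact formulation: it assumes the auxiliary term penalizes pairwise similarity between hidden representations of different-class samples in a batch that mixes new samples with episodic-memory samples. The names `model.features`, `model.classifier`, and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def discriminative_representation_loss(reps, labels):
    """Penalize similarity between representations of samples from
    different classes in the batch (illustrative sketch only)."""
    reps = F.normalize(reps, dim=1)            # L2-normalize representations
    sim = reps @ reps.t()                      # (B, B) pairwise similarities
    diff_class = labels.unsqueeze(0) != labels.unsqueeze(1)
    # Average similarity over pairs drawn from different classes.
    return sim[diff_class].mean() if diff_class.any() else sim.new_zeros(())

def train_step(model, optimizer, x_new, y_new, x_mem, y_mem, lam=0.1):
    """Hypothetical training step: task loss on a batch mixing new and
    memorized samples, plus the auxiliary representation term."""
    x = torch.cat([x_new, x_mem])
    y = torch.cat([y_new, y_mem])
    reps = model.features(x)                   # hidden representations (assumed API)
    logits = model.classifier(reps)
    loss = F.cross_entropy(logits, y) + lam * discriminative_representation_loss(reps, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Since the auxiliary term acts on representations within a single forward pass, it adds only one extra similarity computation per batch, in contrast to re-projection methods that must compute and compare reference gradients for memorized samples.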