Title: Invariant Consistency for Knowledge Distillation

Jul 16, 2024

Nikolaos Giakoumoglou, Tania Stathaki

Abstract: Knowledge distillation (KD) involves transferring the knowledge from one neural network to another, often from a larger, well-trained model (teacher) to a smaller, more efficient model (student). Traditional KD methods minimize the Kullback-Leibler (KL) divergence between the probabilistic outputs of the teacher and student networks. However, this approach often overlooks crucial structural knowledge embedded within the teacher's network. In this paper, we introduce Invariant Consistency Distillation (ICD), a novel methodology designed to enhance KD by ensuring that the student model's representations are consistent with those of the teacher. Our approach combines contrastive learning with an explicit invariance penalty, capturing significantly more information from the teacher's representation of the data. Our results on CIFAR-100 demonstrate that ICD outperforms traditional KD techniques and surpasses 13 state-of-the-art methods. In some cases, the student model even exceeds the teacher model in terms of accuracy. Furthermore, we successfully transfer our method to other datasets, including Tiny ImageNet and STL-10. The code will be made public soon.
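The abstract names three ingredients without giving formulas: a KL term on output distributions, a contrastive term on representations, and an invariance penalty. The NumPy sketch below illustrates one generic way such terms are commonly written. It is not the authors' ICD loss: the InfoNCE form of the contrastive term, the squared-distance form of the invariance penalty, the function names, and the default `temperature`, `tau`, `alpha`, and `beta` values are all assumptions made for illustration. It also assumes student and teacher features have already been projected to the same dimension.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature, shifted for numerical stability."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kd_kl_loss(student_logits, teacher_logits, temperature=4.0):
    """Batch-mean KL(teacher || student) on temperature-softened outputs."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=1)
    # The T^2 factor is the usual rescaling for softened targets.
    return float(kl.mean() * temperature ** 2)


def l2_normalize(x):
    """Scale each row to unit Euclidean norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def contrastive_loss(student_feats, teacher_feats, tau=0.1):
    """InfoNCE-style term (assumed form): student i should match teacher i
    against every other teacher embedding in the batch."""
    s = l2_normalize(student_feats)
    t = l2_normalize(teacher_feats)
    logits = s @ t.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def invariance_penalty(student_feats, teacher_feats):
    """Assumed form: mean squared distance between normalized embeddings."""
    s = l2_normalize(student_feats)
    t = l2_normalize(teacher_feats)
    return float(np.mean(np.sum((s - t) ** 2, axis=1)))


def combined_loss(student_logits, teacher_logits, student_feats, teacher_feats,
                  alpha=1.0, beta=1.0):
    """Hypothetical weighted sum; alpha and beta are illustrative weights."""
    return (kd_kl_loss(student_logits, teacher_logits)
            + alpha * contrastive_loss(student_feats, teacher_feats)
            + beta * invariance_penalty(student_feats, teacher_feats))
```

In this sketch the KL term only sees the output distributions, while the other two terms act on intermediate representations, which is the kind of structural signal the abstract says plain KD overlooks. A real training setup would also include the usual cross-entropy loss on ground-truth labels and would use a framework with automatic differentiation.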

* 7 pages, 1 figure, 3 tables
