Title: Multi-level Knowledge Distillation

Dec 01, 2020

Fei Ding, Feng Luo, Hongxin Hu, Yin Yang

Abstract: Knowledge distillation has become an important technique for model compression and acceleration. Conventional knowledge distillation approaches transfer knowledge from a teacher network to a student network by minimizing the KL-divergence between their probabilistic outputs, which considers only the mutual relationship between individual representations of the teacher and student networks. Recently, contrastive loss-based knowledge distillation has been proposed to enable a student to learn the instance-discriminative knowledge of a teacher by mapping representations of the same image close together and those of different images far apart in the representation space. However, all of these methods ignore that the teacher's knowledge is multi-level, e.g., at the individual, relational, and categorical levels. These different levels of knowledge cannot be effectively captured by only one kind of supervisory signal. Here, we introduce Multi-level Knowledge Distillation (MLKD) to transfer richer representational knowledge from teacher to student networks. MLKD employs three novel teacher-student similarities (individual similarity, relational similarity, and categorical similarity) to encourage the student network to learn sample-wise, structure-wise, and category-wise knowledge from the teacher network. Experiments demonstrate that MLKD outperforms other state-of-the-art methods on both similar-architecture and cross-architecture tasks. We further show that MLKD can improve the transferability of learned representations in the student network.
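
For readers unfamiliar with the conventional baseline the abstract mentions, the following is a minimal sketch of a KL-divergence distillation loss between teacher and student outputs. It is not the MLKD method, whose three similarity losses are not specified in this abstract. The function names `log_softmax` and `kd_kl_loss`, the `temperature` parameter, and the example logits are all assumptions made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits.

    `temperature` is an illustrative assumption; the abstract does not mention it.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    # log-sum-exp with the maximum subtracted to avoid overflow
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]


def kd_kl_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) over the two softmax distributions for one sample."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher log-probabilities
    log_q = log_softmax(student_logits, temperature)  # student log-probabilities
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical logits for a single 3-class sample.
teacher = [2.0, 0.5, -1.0]
student = [0.1, 0.2, 0.3]
loss = kd_kl_loss(teacher, student)
```

The loss is zero when the student's output distribution matches the teacher's and positive otherwise. Because it compares each sample's outputs in isolation, it illustrates the individual-level signal that the abstract argues is insufficient by itself.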

* 9 pages, 5 tables, 4 figures
