Title: What Knowledge Gets Distilled in Knowledge Distillation?

May 31, 2022

Utkarsh Ojha, Yuheng Li, Yong Jae Lee

Abstract: Knowledge distillation aims to transfer useful information from a teacher network to a student network, with the primary goal of improving the student's performance for the task at hand. Over the years, there has been a deluge of novel techniques and use cases of knowledge distillation. Yet, despite the various improvements, there seems to be a glaring gap in the community's fundamental understanding of the process. Specifically, what is the knowledge that gets distilled in knowledge distillation? In other words, in what ways does the student become similar to the teacher? Does it start to localize objects in the same way? Does it get fooled by the same adversarial samples? Do its data invariance properties become similar? Our work presents a comprehensive study to try to answer these questions and more. Our results, using image classification as a case study and three state-of-the-art knowledge distillation techniques, show that knowledge distillation methods can indeed indirectly distill other kinds of properties beyond improving task performance. By exploring these questions, we hope for our work to provide a clearer picture of what happens during knowledge distillation.

View paper on OpenReview
