Yuchen Guan

Kendall's $τ$ Coefficient for Logits Distillation

Sep 26, 2024

Yuchen Guan, Runxi Cheng, Kang Liu, Chun Yuan

Abstract: Knowledge distillation typically employs the Kullback-Leibler (KL) divergence to constrain the student model's output to exactly match the soft labels provided by the teacher model. However, the optimization direction of the KL divergence loss is not always aligned with the task loss: a smaller KL divergence can still lead to erroneous predictions that diverge from the soft labels. This limitation often results in suboptimal optimization for the student. Moreover, even under temperature scaling, the KL divergence loss tends to focus excessively on the larger-valued channels in the logits, disregarding the rich inter-class information provided by the many smaller-valued channels. This hard constraint proves too challenging for lightweight students, hindering further knowledge distillation. To address this issue, we propose a plug-and-play ranking loss based on Kendall's $\tau$ coefficient, called Rank-Kendall Knowledge Distillation (RKKD). RKKD balances the attention given to smaller-valued channels by constraining the order of channel values in the student logits, providing more inter-class relational information. The rank constraint on the top-valued channels helps avoid suboptimal traps during optimization. We also discuss different differentiable forms of Kendall's $\tau$ coefficient and demonstrate that the proposed ranking loss shares a consistent optimization objective with the KL divergence. Extensive experiments on the CIFAR-100 and ImageNet datasets show that RKKD can enhance the performance of various knowledge distillation baselines and offers broad improvements across multiple teacher-student architecture combinations.
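
To make the idea of a rank constraint on logits concrete, the following is a minimal sketch of Kendall's $\tau$ between teacher and student logits, together with one possible smooth surrogate. It is not the RKKD loss as defined in the paper: the abstract does not specify which differentiable form is used, so replacing the sign function with `tanh`, the steepness parameter `alpha`, and all function names here are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_tau(teacher, student):
    """Exact Kendall's tau-a between two equal-length lists of logits.

    Each pair of channels (i, j) contributes +1 if teacher and student
    order the pair the same way, -1 if they disagree, 0 on a tie.
    """
    n = len(teacher)
    total = 0
    for i, j in combinations(range(n), 2):
        total += _sign(teacher[i] - teacher[j]) * _sign(student[i] - student[j])
    return total / (n * (n - 1) / 2)


def soft_kendall_tau(teacher, student, alpha=5.0):
    """Smooth surrogate: sign(x) is replaced by tanh(alpha * x).

    This is an assumed relaxation for illustration; `alpha` (hypothetical)
    controls how closely tanh approximates the sign function.
    """
    n = len(teacher)
    total = 0.0
    for i, j in combinations(range(n), 2):
        t = math.tanh(alpha * (teacher[i] - teacher[j]))
        s = math.tanh(alpha * (student[i] - student[j]))
        total += t * s
    return total / (n * (n - 1) / 2)


def rank_loss(teacher, student, alpha=5.0):
    """Illustrative ranking loss: small when channel orderings agree."""
    return 1.0 - soft_kendall_tau(teacher, student, alpha)
```

Because every pair of channels contributes equally regardless of logit magnitude, a loss of this shape gives small-valued channels the same weight as large-valued ones, which is the behaviour the abstract contrasts with the KL divergence. In a training setup, such a term would presumably be added to an existing distillation loss with a weighting factor, using an autograd framework in place of the plain-Python loops shown here.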

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

Aug 17, 2023

Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei Zhang, Yao Zhao, Sam Kwong

Abstract: Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds. Drawing inspiration from the human visual system, we treat the input shadow image as a composition of a background layer and a shadow layer, and design a Style-guided Dual-layer Disentanglement Network (SDDNet) to model these layers independently. To achieve this, we devise a Feature Separation and Recombination (FSR) module that decomposes multi-level features into shadow-related and background-related components by offering specialized supervision for each component, while preserving information integrity and avoiding redundancy through the reconstruction constraint. Moreover, we propose a Shadow Style Filter (SSF) module to guide the feature disentanglement by focusing on style differentiation and uniformization. With these two modules and our overall pipeline, our model effectively minimizes the detrimental effects of background color, yielding superior performance on three public datasets with a real-time inference speed of 32 FPS.

* Accepted by ACM MM 2023
