Tanapat Ratchatorn

Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization

Jun 20, 2024

Tanapat Ratchatorn, Masayuki Tanaka

Abstract: Recent advancements in learning algorithms have demonstrated that the sharpness of the loss surface is an effective measure for reducing the generalization gap. Building upon this concept, Sharpness-Aware Minimization (SAM) was proposed to enhance model generalization and achieved state-of-the-art performance. SAM consists of two main steps: the weight perturbation step and the weight updating step. However, the perturbation in SAM is determined only by the gradient of the training loss, that is, the cross-entropy loss. As the model approaches a stationary point, this gradient becomes small and oscillates, which leads to inconsistent perturbation directions and can cause the gradient to diminish. Our research introduces an approach to further enhance model generalization. We propose the Adaptive Adversarial Cross-Entropy (AACE) loss function to replace the standard cross-entropy loss for SAM's perturbation. The AACE loss and its gradient uniquely increase as the model nears convergence, ensuring a consistent perturbation direction and addressing the gradient diminishing issue. Additionally, a novel perturbation-generating function that uses the AACE loss without normalization is proposed, enhancing the model's exploratory capabilities in near-optimum stages. Empirical testing confirms the effectiveness of AACE, with experiments demonstrating improved performance in image classification tasks using Wide ResNet and PyramidNet across various datasets. The reproduction code is available online.

* Accepted at ICIP 2024. The project page can be accessed at http://www.vip.sc.e.titech.ac.jp/proj/AACE
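
The two SAM steps named in the abstract (weight perturbation, then weight update) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names `perturbation` and `sam_step`, the toy quadratic loss, and the values of `lr` and `rho` are all assumptions made for illustration. The abstract does not give the AACE formula, so the sketch only exposes a separate `grad_perturb` argument where a different perturbation loss could be plugged in, and the `normalize=False` branch is a guess at what "without normalization" might look like in its simplest form.

```python
import math


def perturbation(g, rho=0.05, normalize=True):
    """Turn a gradient g (list of floats) into a weight perturbation.

    normalize=True is the usual SAM form: rho * g / ||g||.
    normalize=False simply scales g by rho (an assumed, simplified
    stand-in for an unnormalized perturbation).
    """
    if normalize:
        norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
        return [rho * gi / norm for gi in g]
    return [rho * gi for gi in g]


def sam_step(w, grad_update, grad_perturb, lr=0.05, rho=0.05, normalize=True):
    """One SAM-style step on weights w (list of floats).

    grad_update:  gradient of the training loss, used for the weight update.
    grad_perturb: gradient of the loss that defines the perturbation.
                  Standard SAM passes the same function for both.
    """
    # Step 1: weight perturbation.
    eps = perturbation(grad_perturb(w), rho, normalize)
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    # Step 2: weight update, using the gradient taken at the perturbed
    # point but applied to the original weights.
    g_adv = grad_update(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


# Toy quadratic loss (hypothetical): f(w) = 0.5 * sum(a_i * w_i^2).
A = [1.0, 10.0]


def loss(w):
    return 0.5 * sum(a * wi * wi for a, wi in zip(A, w))


def grad(w):
    return [a * wi for a, wi in zip(A, w)]


w = [1.0, -2.0]
initial_loss = loss(w)
for _ in range(100):
    w = sam_step(w, grad_update=grad, grad_perturb=grad)
final_loss = loss(w)
```

With normalization, the perturbation always has length `rho`, so its direction depends entirely on the direction of a gradient that becomes small and noisy near a stationary point. That is the behavior the abstract identifies as the motivation for using a different loss in the perturbation step.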
