Yiang Luo

Fast and Slow Gradient Approximation for Binary Neural Network Optimization

Dec 16, 2024

Xinquan Chen, Junqi Gao, Biqing Qi, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li

Abstract: Binary Neural Networks (BNNs) have garnered significant attention due to their immense potential for deployment on edge devices. However, the non-differentiability of the quantization function poses a challenge for the optimization of BNNs, as its derivative cannot be backpropagated. To address this issue, hypernetwork-based methods, which use neural networks to learn the gradients of non-differentiable quantization functions, have emerged as a promising approach because their adaptive learning capabilities reduce estimation errors. However, existing hypernetwork-based methods typically rely solely on current gradient information, neglecting the influence of historical gradients. This oversight can lead to accumulated gradient errors when calculating gradient momentum during optimization. To incorporate historical gradient information, we design a Historical Gradient Storage (HGS) module, which models the historical gradient sequence to generate the first-order momentum required for optimization. To further enhance gradient generation in hypernetworks, we propose a Fast and Slow Gradient Generation (FSG) method. Additionally, to produce more precise gradients, we introduce Layer Recognition Embeddings (LRE) into the hypernetwork, facilitating the generation of layer-specific fine gradients. Extensive comparative experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that our method achieves faster convergence and lower loss values, outperforming existing baselines. Code is available at http://github.com/two-tiger/FSG.

* Accepted to AAAI 2025
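
For readers unfamiliar with the problem the abstract describes, the following is a minimal sketch, not the authors' method. It shows sign quantization, a hand-written surrogate gradient (the common straight-through estimator, swapped in here where the paper uses a learned hypernetwork), and a first-order momentum update that accumulates past gradients. All function names and the `clip`, `lr`, and `beta` parameters are illustrative assumptions, not taken from the paper or its repository.

```python
def binarize(weights):
    """Sign quantization: map each real-valued weight to +1.0 or -1.0."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def ste_grad(weights, upstream, clip=1.0):
    """Straight-through estimator (an assumed stand-in for a learned gradient).

    The true derivative of sign() is zero almost everywhere, so the upstream
    gradient is passed through unchanged where |w| <= clip and zeroed elsewhere.
    """
    return [g if abs(w) <= clip else 0.0 for w, g in zip(weights, upstream)]


def momentum_step(weights, momentum, grad, lr=0.1, beta=0.9):
    """First-order momentum: exponential moving average of past gradients,
    followed by a plain gradient step on the latent real-valued weights."""
    new_m = [beta * m + (1.0 - beta) * g for m, g in zip(momentum, grad)]
    new_w = [w - lr * m for w, m in zip(weights, new_m)]
    return new_w, new_m


# Hypothetical latent weights and upstream gradients for one layer.
w = [0.5, -0.2, 1.5]
upstream = [1.0, -2.0, 3.0]
m = [0.0, 0.0, 0.0]

b = binarize(w)            # binary weights used in the forward pass
g = ste_grad(w, upstream)  # surrogate gradient for the backward pass
w2, m2 = momentum_step(w, m, g)
```

Because the momentum term averages surrogate gradients over many steps, any systematic error in the surrogate is carried forward, which is the accumulation issue the abstract attributes to methods that ignore historical gradients.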

Investigating Deep Watermark Security: An Adversarial Transferability Perspective

Feb 26, 2024

Biqing Qi, Junqi Gao, Yiang Luo, Jianxing Liu, Ligang Wu, Bowen Zhou

Abstract: The rise of generative neural networks has triggered an increased demand for intellectual property (IP) protection in generated content. Deep watermarking techniques, recognized for their flexibility in IP protection, have garnered significant attention. However, the surge in adversarial transferable attacks poses unprecedented challenges to the security of deep watermarking techniques, an area currently lacking systematic investigation. This study fills this gap by introducing two effective transferable attackers to assess the vulnerability of deep watermarks against erasure and tampering risks. Specifically, we first define the concept of local sample density and use it to derive theorems on the consistency of model outputs. Upon discovering that perturbing samples towards high sample density regions (HSDR) of the target class enhances targeted adversarial transferability, we propose the Easy Sample Selection (ESS) mechanism and the Easy Sample Matching Attack (ESMA) method. Additionally, we propose Bottleneck Enhanced Mixup (BEM), which integrates information bottleneck theory to reduce the generator's dependence on irrelevant noise. Experiments show a significant improvement in the success rate of targeted transfer attacks for both the ESMA and BEM-ESMA methods. We further conduct a comprehensive evaluation using ESMA and BEM-ESMA as measurements, considering model architecture and watermark encoding length, and report several notable findings.

* 18 pages, 8 figures
