Kaiqi Zhang

KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning

Aug 15, 2024

Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

Jun 10, 2024

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

Jul 04, 2023

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks

Jun 13, 2022

Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?

Apr 21, 2022

3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration

May 11, 2021

Active Subspace of Neural Networks: Structural Analysis and Universal Attacks

Oct 29, 2019

A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM

Nov 05, 2018

Progressive Weight Pruning of Deep Neural Networks using ADMM

Nov 04, 2018

ADAM-ADMM: A Unified, Systematic Framework of Structured Weight Pruning for DNNs

Jul 29, 2018