Haoran You

Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers

Dec 22, 2024

EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting

Jun 22, 2024

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

Jun 11, 2024
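
The appeal of linearized attention for autoregressive decoding is that softmax attention's per-token cost, which grows with the prefix length, collapses to a constant-time recurrence over a running key-value state. Below is a minimal sketch of that recurrence, assuming a generic ELU+1 feature map; it illustrates the general linear-attention mechanism, not necessarily this paper's specific formulation:

```python
import numpy as np

def phi(x):
    # ELU(x) + 1 feature map, a common choice for linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention_decode(qs, ks, vs):
    """Causal linear attention as an O(1)-per-token recurrence: instead of
    re-attending over the whole prefix at each step, maintain the running
    state S = sum_t phi(k_t) v_t^T and normalizer z = sum_t phi(k_t)."""
    d, dv = qs.shape[1], vs.shape[1]
    S = np.zeros((d, dv))
    z = np.zeros(d)
    outs = []
    for q, k, v in zip(qs, ks, vs):
        fk = phi(k)
        S += np.outer(fk, v)          # accumulate key-value state
        z += fk                       # accumulate normalizer
        fq = phi(q)
        outs.append(fq @ S / (fq @ z + 1e-6))
    return np.array(outs)

T, d = 5, 4
out = linear_attention_decode(np.random.randn(T, d),
                              np.random.randn(T, d),
                              np.random.randn(T, d))
print(out.shape)  # (5, 4)
```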

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Jun 11, 2024
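
The core idea behind multiplication-less reparameterization is that a weight rounded to a signed power of two turns each multiply into a sign flip plus a bit shift. Here is a minimal NumPy sketch of that idea, emulating the shift with `np.ldexp`; ShiftAddLLM's actual post-training reparameterization is more involved than this toy power-of-two rounding:

```python
import numpy as np

def shift_add_matvec(W, x):
    """Approximate W @ x without multiplications: round each weight to a
    signed power of two, so each product w * x_j becomes a sign flip plus
    a bit shift (emulated in floating point via np.ldexp). Illustrative
    only; accuracy depends on how well weights fit powers of two."""
    signs = np.sign(W)
    # nearest power-of-two exponent for each weight magnitude
    exps = np.round(np.log2(np.abs(W) + 1e-12)).astype(int)
    # x * 2**exp is a bit shift for fixed-point x; ldexp emulates it here
    shifted = np.ldexp(np.broadcast_to(x, W.shape), exps)
    return (signs * shifted).sum(axis=1)

W = np.random.randn(4, 8)
x = np.random.randn(8)
print(shift_add_matvec(W, x))  # power-of-two approximation
print(W @ x)                   # exact reference
```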

Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI

Jan 02, 2024

NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

Oct 24, 2023
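
In-situ distillation trains the tiny student alongside a teacher rather than distilling from a separately pretrained one; the distillation signal itself is the standard softened-logit objective. A minimal sketch of that generic objective follows, with illustrative hyperparameters `T` and `alpha` that are assumptions, not the paper's values:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Generic knowledge-distillation objective: the student matches the
    teacher's temperature-softened logits (KL term) while still fitting
    the ground-truth labels (cross-entropy term)."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to compensate for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# usage: logits from a weight-sharing teacher and its tiny student
s = torch.randn(8, 10)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(distillation_loss(s, t, y))
```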

NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants

Jun 23, 2023

ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer

Jun 10, 2023

Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

Apr 25, 2023

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference

Nov 18, 2022