Yuji Chai

FlexQuant: Elastic Quantization Framework for Locally Hosted LLM on Edge Devices

Jan 13, 2025
Via arXiv

INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation

Jun 13, 2023
Via arXiv

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

Jan 26, 2023
Via arXiv

Bigger&Faster: Two-stage Neural Architecture Search for Quantized Transformer Models

Sep 25, 2022
Via arXiv