Yongin Kwon

A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B

Sep 17, 2024
Via arXiv

Mixed Non-linear Quantization for Vision Transformers

Jul 26, 2024
Via arXiv

LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs

Apr 16, 2024
Via arXiv

Visual Preference Inference: An Image Sequence-Based Preference Reasoning in Tabletop Object Manipulation

Mar 18, 2024
Via arXiv

Tensor Slicing and Optimization for Multicore NPUs

Apr 06, 2023
Via arXiv

Q-HyViT: Post-Training Quantization for Hybrid Vision Transformer with Bridge Block Reconstruction

Mar 22, 2023
Via arXiv

CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution

Jul 04, 2022
Via arXiv

Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment

Feb 21, 2022
Via arXiv