Yerui Sun

Scaling Embeddings Outperforms Scaling Experts in Language Models

Jan 29, 2026

LongCat-Flash-Thinking-2601 Technical Report

Jan 23, 2026

Efficient Context Scaling with LongCat ZigZag Attention

Dec 30, 2025

AFA-LoRA: Enabling Non-Linear Adaptations in LoRA with Activation Function Annealing

Dec 27, 2025

Accelerate Speculative Decoding with Sparse Computation in Verification

Dec 26, 2025

Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference

Dec 06, 2024

Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs

May 23, 2024

A Speed Odyssey for Deployable Quantization of LLMs

Nov 16, 2023

FPTQ: Fine-grained Post-Training Quantization for Large Language Models

Aug 30, 2023