Minjia Zhang

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Dec 30, 2024
Via arXiv

MiniKV: Pushing the Limits of LLM Inference via 2-Bit Layer-Discriminative KV Cache

Nov 28, 2024
Via arXiv

Pushing the Limits of LLM Inference via 2-Bit Layer-Discriminative KV Cache

Nov 27, 2024
Via arXiv

Transforming the Hybrid Cloud for Emerging AI Workloads

Nov 20, 2024
Via arXiv

Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment

Nov 05, 2024
Via arXiv

Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions

Aug 01, 2024
Via arXiv

Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks

Jul 11, 2024
Via arXiv

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Jul 07, 2024
Via arXiv

Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

Jun 27, 2024
Via arXiv

Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

Jan 17, 2024
Via arXiv