Joseph E. Gonzalez

Autellix: An Efficient Serving Engine for LLM Agents as General Programs

Feb 19, 2025
Via arXiv

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Feb 12, 2025
Via arXiv

LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters!

Feb 11, 2025
Via arXiv

Adaptive Semantic Prompt Caching with VectorQ

Feb 06, 2025
Via arXiv

BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation

Feb 03, 2025
Via arXiv

HashAttention: Semantic Sparsity for Faster Inference

Dec 19, 2024
Via arXiv

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Dec 11, 2024
Via arXiv

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs

Nov 18, 2024
Via arXiv

Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving

Oct 21, 2024
Via arXiv

How to Evaluate Reward Models for RLHF

Oct 18, 2024
Via arXiv