Mehdi Rezagholizadeh

Huawei Noah's Ark Lab

Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression

Dec 07, 2024
Via arXiv

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination

Oct 22, 2024
Via arXiv

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity

Oct 01, 2024
Via arXiv

EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models

Sep 22, 2024
Via arXiv

Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models

Aug 16, 2024
Via arXiv

S2D: Sorted Speculative Decoding For More Efficient Deployment of Nested Large Language Models

Jul 02, 2024
Via arXiv

EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems

Jun 14, 2024
Via arXiv

CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search

Jun 07, 2024
Via arXiv

OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection

Jun 04, 2024
Via arXiv

CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems

May 24, 2024
Via arXiv