Zouying Cao

KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing

Oct 24, 2024

Nothing in Excess: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering

Aug 21, 2024

Head-wise Shareable Attention for Large Language Models

Feb 19, 2024

LaCo: Large Language Model Pruning via Layer Collapse

Feb 17, 2024

AutoHall: Automated Hallucination Dataset Generation for Large Language Models

Sep 30, 2023