Ang Lv

More Expressive Attention with Negative Weights

Nov 14, 2024

HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation

Oct 28, 2024

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

Sep 29, 2024

Language Models "Grok" to Copy

Sep 14, 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules

Jul 09, 2024

Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

Jun 28, 2024

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

Apr 09, 2024

Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models

Mar 04, 2024

Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning

Jan 12, 2024

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

Dec 07, 2023