Ta-Chung Chi

Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation
Nov 15, 2023

Advancing Regular Language Reasoning in Linear Recurrent Neural Networks
Sep 14, 2023

Structured Dialogue Discourse Parsing
Jun 26, 2023

PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification
May 24, 2023

Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
May 23, 2023

Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation
May 05, 2023

Receptive Field Alignment Enables Transformer Length Extrapolation
Dec 20, 2022

On Task-Adaptive Pretraining for Dialogue Response Selection
Oct 08, 2022

Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Jun 15, 2022

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
May 20, 2022