Alexander I. Rudnicky

Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation
Nov 15, 2023

Advancing Regular Language Reasoning in Linear Recurrent Neural Networks
Sep 14, 2023

Structured Dialogue Discourse Parsing
Jun 26, 2023

Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
May 23, 2023

Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation
May 05, 2023

Receptive Field Alignment Enables Transformer Length Extrapolation
Dec 20, 2022

Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Jun 15, 2022

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
May 20, 2022

Zero-Shot Dialogue Disentanglement by Self-Supervised Entangled Response Selection
Oct 25, 2021

Learning Conversational Systems that Interleave Task and Non-Task Content
Mar 01, 2017