Ryosuke Takahashi

Relaxing Positional Alignment in Masked Diffusion Language Models

Jan 30, 2026
Via arXiv

Suppressing Final Layer Hidden State Jumps in Transformer Pretraining

Jan 26, 2026
Via arXiv

Can Language Models Handle a Non-Gregorian Calendar?

Sep 04, 2025
Via arXiv

Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models

Aug 25, 2025
Via arXiv

TopK Language Models

Jun 26, 2025
Via arXiv

The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

Jun 10, 2024
Via arXiv