Rio Yokota

Improving LoRA with Variational Learning
Jun 17, 2025

Variational Learning Finds Flatter Solutions at the Edge of Stability
Jun 15, 2025

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
May 05, 2025

Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models
Mar 31, 2025

On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
Mar 04, 2025

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Feb 26, 2025

Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs
Dec 19, 2024

Lion Cub: Minimizing Communication Overhead in Distributed Lion
Nov 25, 2024

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs
Nov 10, 2024

Variational Low-Rank Adaptation Using IVON
Nov 07, 2024