Taishi Nakamura

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Mar 06, 2025

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Feb 26, 2025

Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs

Dec 19, 2024

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs

Nov 10, 2024

Agent Skill Acquisition for Large Language Models via CycleQD

Oct 16, 2024

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Jul 04, 2024

Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities

Apr 27, 2024

Building a Large Japanese Web Corpus for Large Language Models

Apr 27, 2024

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024