Tatsuya Hiraoka

Repetition Neurons: How Do Language Models Produce Repetitions?

Oct 17, 2024

The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces

Oct 17, 2024

SubRegWeigh: Effective and Efficient Annotation Weighing with Subword Regularization

Sep 10, 2024

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Jul 04, 2024

An Analysis of BPE Vocabulary Trimming in Neural Machine Translation

Mar 30, 2024

Knowledge of Pretrained Language Models on Surface Information of Tokens

Feb 22, 2024

Tokenization Tractability for Human and Machine Learning Model: An Annotation Study

Apr 21, 2023

Downstream Task-Oriented Neural Tokenizer Optimization with Vocabulary Restriction as Post Processing

Apr 21, 2023

MaxMatch-Dropout: Subword Regularization for WordPiece

Sep 09, 2022

Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation

Mar 25, 2022