Hitomi Yanaka

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling

Jul 22, 2024
Via arXiv

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Jul 04, 2024
Via arXiv

Evaluating Structural Generalization in Neural Machine Translation

Jun 19, 2024
Via arXiv

Exploring Intra and Inter-language Consistency in Embeddings with ICA

Jun 18, 2024
Via arXiv

Analyzing Social Biases in Japanese Large Language Models

Jun 04, 2024
Via arXiv

Topic Modeling for Short Texts with Large Language Models

Jun 02, 2024
Via arXiv

On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons

Apr 03, 2024
Via arXiv

Constructing Multilingual Code Search Dataset Using Neural Machine Translation

Jun 27, 2023
Via arXiv

Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models

Jun 19, 2023
Via arXiv

Analyzing Syntactic Generalization Capacity of Pre-trained Language Models on Japanese Honorific Conversion

Jun 05, 2023
Via arXiv