Shuhei Kurita

Developing Vision-Language-Action Model from Egocentric Videos

Sep 26, 2025

Llama-Mimi: Speech Language Models with Interleaved Semantic and Acoustic Tokens

Sep 18, 2025

LegalViz: Legal Text Visualization by Text To Diagram Generation

Feb 10, 2025

Constructing Multimodal Datasets from Scratch for Rapid Development of a Japanese Visual Language Model

Oct 30, 2024

AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering

Jul 28, 2024

Answerability Fields: Answerable Location Estimation via Diffusion Models

Jul 26, 2024

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Jul 04, 2024

CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information

Jun 20, 2024

Map-based Modular Approach for Zero-shot Embodied Question Answering

May 26, 2024

Text-driven Affordance Learning from Egocentric Vision

Apr 03, 2024