Osamu Yoshie

Where, What, Why: Toward Explainable 3D-GS Watermarking

Mar 09, 2026

Chart Specification: Structural Representations for Incentivizing VLM Reasoning in Chart-to-Code Generation

Feb 11, 2026

FluencyVE: Marrying Temporal-Aware Mamba with Bypass Attention for Video Editing

Dec 24, 2025

Multi-Attribute guided Thermal Face Image Translation based on Latent Diffusion Model

Dec 24, 2025

DM$^3$Net: Dual-Camera Super-Resolution via Domain Modulation and Multi-scale Matching

Jun 08, 2025

PADriver: Towards Personalized Autonomous Driving

May 08, 2025

R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning

Feb 27, 2025

RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World

Nov 29, 2024

What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance

Aug 23, 2024

MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment

Jun 28, 2024