Nakamasa Inoue

Multi-Point Positional Insertion Tuning for Small Object Detection

Dec 24, 2024

HarmonicEval: Multi-modal, Multi-task, Multi-criteria Automatic Evaluation Using a Vision Language Model

Dec 19, 2024

HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis

Oct 06, 2024

Rethinking Image Super-Resolution from Training Data Perspectives

Sep 01, 2024

Scaling Backwards: Minimal Synthetic Pre-training?

Aug 03, 2024

Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering

Jul 30, 2024

ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks

Jul 28, 2024

AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering

Jul 28, 2024

CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information

Jun 20, 2024

CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data

Oct 28, 2023