Yuki Saito

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features

Dec 26, 2024

An Environment-Adaptive Position/Force Control Based on Physical Property Estimation

Dec 19, 2024

An Empirical Analysis of GPT-4V's Performance on Fashion Aesthetic Evaluation

Oct 31, 2024

Construction and Analysis of Impression Caption Dataset for Environmental Sounds

Oct 20, 2024

Disentangling Likes and Dislikes in Personalized Generative Explainable Recommendation

Oct 17, 2024

The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech

Sep 14, 2024

Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT

Sep 11, 2024

A Fashion Item Recommendation Model in Hyperbolic Space

Sep 04, 2024

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling

Jul 22, 2024

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals

Jun 25, 2024