Tatsuya Komatsu

Music Tagging with Classifier Group Chains

Jan 09, 2025
Via arXiv

Pre-training with Synthetic Patterns for Audio

Oct 01, 2024
Via arXiv

DETECLAP: Enhancing Audio-Visual Representation Learning with Object Information

Sep 18, 2024
Via arXiv

Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection

Aug 06, 2024
Via arXiv

Audio Fingerprinting with Holographic Reduced Representations

Jun 19, 2024
Via arXiv

Universal Score-based Speech Enhancement with High Content Preservation

Jun 18, 2024
Via arXiv

Keep Decoding Parallel with Effective Knowledge Distillation from Language Models to End-to-end Speech Recognisers

Jan 22, 2024
Via arXiv

Audio Difference Learning for Audio Captioning

Sep 15, 2023
Via arXiv

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Sep 15, 2023
Via arXiv

Neural Diarization with Non-autoregressive Intermediate Attractors

Mar 13, 2023
Via arXiv