Picture for Daiki Takeuchi

Daiki Takeuchi

Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes

Add code
Mar 28, 2025
Viaarxiv icon

M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP

Add code
Mar 28, 2025
Viaarxiv icon

M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation

Add code
Jun 04, 2024
Figure 1 for M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Figure 2 for M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Figure 3 for M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Figure 4 for M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Viaarxiv icon

Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection

Add code
Apr 26, 2024
Figure 1 for Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Figure 2 for Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Figure 3 for Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Figure 4 for Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Viaarxiv icon

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Add code
Apr 09, 2024
Figure 1 for Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Figure 2 for Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Figure 3 for Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Figure 4 for Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Viaarxiv icon

Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval

Add code
Mar 16, 2024
Figure 1 for Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Figure 2 for Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Figure 3 for Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Figure 4 for Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Viaarxiv icon

Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN

Add code
Feb 13, 2024
Viaarxiv icon

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

Add code
Aug 23, 2023
Figure 1 for Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Figure 2 for Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Figure 3 for Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Viaarxiv icon

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation

Add code
May 23, 2023
Viaarxiv icon

Deep sound-field denoiser: optically-measured sound-field denoising using deep neural network

Add code
Apr 27, 2023
Viaarxiv icon