Yasunori Ohishi

M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP

Mar 28, 2025
Via arXiv

Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes

Mar 28, 2025
Via arXiv

M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation

Jun 04, 2024
Via arXiv

Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection

Apr 26, 2024
Via arXiv

Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis

Apr 12, 2024
Via arXiv

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Apr 09, 2024
Via arXiv

Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval

Mar 16, 2024
Via arXiv

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

Aug 23, 2023
Via arXiv

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation

May 23, 2023
Via arXiv

First-shot anomaly sound detection for machine condition monitoring: A domain generalization baseline

Mar 01, 2023
Via arXiv