Daisuke Niizumi

M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP

Mar 28, 2025

Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes

Mar 28, 2025

Description and Discussion on DCASE 2024 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

Jun 11, 2024

M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation

Jun 04, 2024

Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection

Apr 26, 2024

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Apr 09, 2024

Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval

Mar 16, 2024

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

Aug 23, 2023

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation

May 23, 2023

Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

May 13, 2023