Keisuke Imoto

DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection

Oct 30, 2024

Challenge on Sound Scene Synthesis: Evaluating Text-to-Audio Generation

Oct 23, 2024

Construction and Analysis of Impression Caption Dataset for Environmental Sounds

Oct 20, 2024

LEAD Dataset: How Can Labels for Sound Event Detection Vary Depending on Annotators?

Oct 13, 2024

Description and Discussion on DCASE 2024 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

Jun 11, 2024

M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation

Jun 04, 2024

Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant

Mar 26, 2024

Discriminative Neighborhood Smoothing for Generative Anomalous Sound Detection

Mar 18, 2024

Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval

Mar 16, 2024

F1-EV Score: Measuring the Likelihood of Estimating a Good Decision Threshold for Semi-Supervised Anomaly Detection

Dec 14, 2023