Mark D. Plumbley

AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models

Nov 28, 2024
Via arXiv

PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection

Nov 10, 2024
Via arXiv

A decade of DCASE: Achievements, practices, evaluations and future challenges

Oct 07, 2024
Via arXiv

The Sounds of Home: A Speech-Removed Residential Audio Dataset for Sound Event Detection

Sep 17, 2024
Via arXiv

FlowSep: Language-Queried Sound Separation with Rectified Flow Matching

Sep 11, 2024
Via arXiv

Exploring Differences between Human Perception and Model Inference in Audio Event Recognition

Sep 10, 2024
Via arXiv

Integrating IP Broadcasting with Audio Tags: Workflow and Challenges

Jul 23, 2024
Via arXiv

Efficient Audio Captioning with Encoder-Level Knowledge Distillation

Jul 19, 2024
Via arXiv

Universal Sound Separation with Self-Supervised Audio Masked Autoencoder

Jul 16, 2024
Via arXiv

Improving Audio Generation with Visual Enhanced Caption

Jul 05, 2024
Via arXiv