Title: FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

Jul 01, 2024

Yiming Zhang, Yicheng Gu, Yanhong Zeng, Zhening Xing, Yuancheng Wang, Zhizheng Wu, Kai Chen

Abstract: We study Neural Foley, the automatic generation of high-quality sound effects synchronized with videos, enabling an immersive audio-visual experience. Despite its wide range of applications, existing approaches struggle to synthesize sounds that are simultaneously high-quality and video-aligned (i.e., semantically relevant and temporally synchronized). To overcome these limitations, we propose FoleyCrafter, a novel framework that leverages a pre-trained text-to-audio model to ensure high-quality audio generation. FoleyCrafter comprises two key components: the semantic adapter for semantic alignment and the temporal controller for precise audio-video synchronization. The semantic adapter utilizes parallel cross-attention layers to condition audio generation on video features, producing realistic sound effects that are semantically relevant to the visual content. Meanwhile, the temporal controller incorporates an onset detector and a timestamp-based adapter to achieve precise audio-video alignment. One notable advantage of FoleyCrafter is its compatibility with text prompts, enabling the use of text descriptions to achieve controllable and diverse video-to-audio generation according to user intent. We conduct extensive quantitative and qualitative experiments on standard benchmarks to verify the effectiveness of FoleyCrafter. Models and code are available at https://github.com/open-mmlab/FoleyCrafter.

* Project page: https://foleycrafter.github.io/
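
The abstract's "parallel cross-attention" idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes the video branch runs alongside the existing text branch and that the two outputs are summed, with a hypothetical `video_scale` weight on the video branch. The function names are illustrative, and the learned query/key/value projections of a real model are omitted, so tokens serve directly as keys and values.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists (no learned projections)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def parallel_cross_attention(audio_latents, text_tokens, video_tokens, video_scale=1.0):
    """Attend to text and video tokens in two parallel branches, then sum.

    Assumption: the branches are combined by a weighted sum; `video_scale`
    is a hypothetical knob, not a parameter named in the abstract.
    """
    text_out = attention(audio_latents, text_tokens, text_tokens)
    video_out = attention(audio_latents, video_tokens, video_tokens)
    return [
        [t + video_scale * v for t, v in zip(text_row, video_row)]
        for text_row, video_row in zip(text_out, video_out)
    ]


if __name__ == "__main__":
    latents = [[1.0, 0.0], [0.0, 1.0]]   # two audio latent positions
    text = [[1.0, 1.0]]                  # one text token
    video = [[2.0, 3.0]]                 # one video token
    print(parallel_cross_attention(latents, text, video, video_scale=0.5))
```

Under this assumed design, setting `video_scale` to zero recovers the text-only branch, which is one way such an adapter could leave the pre-trained text-to-audio model's behavior intact while still accepting text prompts.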
