Title: Coordinate In and Value Out: Training Flow Transformers in Ambient Space

Dec 05, 2024

Yuyang Wang, Anurag Ranjan, Josh Susskind, Miguel Angel Bautista

Abstract: Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on unstructured data like 3D point clouds. These models are commonly trained in two stages: first, a data compressor (i.e., a variational auto-encoder) is trained, and in a subsequent training stage a flow matching generative model is trained in the low-dimensional latent space of the data compressor. This two-stage paradigm adds complexity to the overall training recipe and sets obstacles for unifying models across data domains, as specific data compressors are used for different data modalities. To this end, we introduce Ambient Space Flow Transformers (ASFT), a domain-agnostic approach to learn flow matching transformers in ambient space, sidestepping the requirement of training compressors and simplifying the training process. We introduce a conditionally independent point-wise training objective that enables ASFT to make predictions continuously in coordinate space. Our empirical results demonstrate that, using general-purpose transformer blocks, ASFT effectively handles different data modalities such as images and 3D point clouds, achieving strong performance in both domains and outperforming comparable approaches. ASFT is a promising step towards domain-agnostic flow matching generative models that can be trivially adopted in different data domains.
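To make the "coordinate in, value out" idea concrete, here is a minimal sketch of a point-wise flow matching loss computed directly on (coordinate, value) pairs, with no compressor stage. It is not the authors' implementation. The linear noise-to-data path, the function names (`sample_points`, `flow_matching_targets`, `pointwise_loss`), and the `model(coord, x_t, t)` signature are all assumptions chosen for illustration; the abstract does not specify them, and the toy model below stands in for a real network.

```python
import random

def sample_points(coords, values, num_points, rng):
    """Pick a random subset of (coordinate, value) pairs from one data sample."""
    idx = rng.sample(range(len(coords)), num_points)
    return [coords[i] for i in idx], [values[i] for i in idx]

def flow_matching_targets(values, t, rng):
    """Assumed linear path: x_t = (1 - t) * noise + t * data.
    The regression target is the velocity data - noise, computed per point."""
    noisy, target = [], []
    for v in values:
        eps = [rng.gauss(0.0, 1.0) for _ in v]
        noisy.append([(1.0 - t) * e + t * x for e, x in zip(eps, v)])
        target.append([x - e for e, x in zip(eps, v)])
    return noisy, target

def pointwise_loss(model, coords, noisy, target, t):
    """Mean squared error between predicted and target velocities,
    accumulated one point at a time."""
    total, count = 0.0, 0
    for c, x_t, u in zip(coords, noisy, target):
        pred = model(c, x_t, t)  # hypothetical model signature
        total += sum((p - q) ** 2 for p, q in zip(pred, u))
        count += len(u)
    return total / count

# Toy data: a 2x2 RGB "image" as coordinates in [0, 1]^2 with 3-channel values.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
values = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 0.0, 0.5]]

rng = random.Random(0)
t = rng.random()  # flow time in [0, 1)
sub_coords, sub_values = sample_points(coords, values, 3, rng)
noisy, target = flow_matching_targets(sub_values, t, rng)

# Placeholder model that always predicts zero velocity.
zero_model = lambda coord, x_t, t: [0.0] * len(x_t)
loss = pointwise_loss(zero_model, sub_coords, noisy, target, t)
```

Because the loss is a sum over individual points, the same code would apply to a 3D point cloud by swapping in 3D coordinates and values, which is the kind of domain-agnostic behaviour the abstract describes.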

* 23 pages, 10 figures, 10 tables
