Abstract: Spatial audio in Extended Reality (XR) provides users with better awareness of where virtual elements are placed and efficiently guides them to events such as notifications, system alerts from different windows, or approaching avatars. Humans, however, are inaccurate at localizing sound cues, especially with multiple sources, due to limitations in human auditory perception such as angular discrimination error and front-back confusion. This decreases the efficiency of XR interfaces because users misidentify which XR element a sound is coming from. To address this, we propose Auptimize, a novel computational approach for placing XR sound sources that mitigates such localization errors by utilizing the ventriloquist effect. Auptimize disentangles the sound source locations from the visual elements and relocates the sound sources to optimal positions for unambiguous identification of sound cues, avoiding errors due to inter-source proximity and front-back confusion. Our evaluation shows that Auptimize decreases spatial audio-based source identification errors compared to playing sound cues at the paired visual-sound locations. We demonstrate the applicability of Auptimize to diverse spatial audio-based interactive XR scenarios.
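To make the placement idea concrete, here is a minimal Python sketch of one way such an optimization could look: given the azimuths of the visual elements, a sound azimuth is chosen for each element so that the minimum pairwise perceptual separation is maximized, with front-back mirror angles treated as confusable. This is an illustrative simplification, not Auptimize's actual objective or solver; the functions `perceptual_distance` and `place_sound_sources`, the candidate grid, and the brute-force search are all assumptions made for the example.

```python
import itertools
import math

def angular_separation(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def perceptual_distance(a, b):
    """Separation that also penalizes front-back mirror pairs.

    With azimuth 0 = straight ahead, a source at azimuth b is easily confused
    with its mirror about the interaural axis, (180 - b) % 360, so we take the
    minimum of the direct separation and the separation to that mirror.
    """
    return min(angular_separation(a, b), angular_separation(a, (180 - b) % 360))

def place_sound_sources(visual_azimuths, candidate_azimuths, max_shift=60):
    """Pick one sound azimuth per visual element (a ventriloquist-style offset)
    so the minimum pairwise perceptual distance is maximized, while keeping
    each sound within `max_shift` degrees of its visual anchor.

    Brute force over candidate assignments; fine for a handful of elements.
    """
    options = [
        [c for c in candidate_azimuths if angular_separation(c, v) <= max_shift]
        for v in visual_azimuths
    ]
    best, best_score = None, -math.inf
    for assignment in itertools.product(*options):
        score = min(
            perceptual_distance(a, b)
            for a, b in itertools.combinations(assignment, 2)
        )
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

if __name__ == "__main__":
    visuals = [10, 30, 170]            # azimuths of three visual elements
    candidates = range(0, 360, 15)     # allowed sound-cue positions
    placement, margin = place_sound_sources(visuals, candidates)
    print("sound azimuths:", placement, "min perceptual separation:", margin)
```

Treating an angle and its front-back mirror as nearly identical is a crude stand-in for the cone-of-confusion effect; a real system would presumably use a more faithful perceptual model and a smarter solver.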
Abstract: We present STAT: a simple algorithm to prune transformer models without any fine-tuning. STAT eliminates both attention heads and neurons from the network while preserving accuracy by calculating a correction to the weights of the next layer. Each layer block in the network is compressed using a series of principled matrix factorizations that preserve the network structure. Our entire algorithm takes minutes to compress BERT, and less than three hours to compress models with 7B parameters using a single GPU. Using only several hundred data examples, STAT preserves the output of the network and improves upon existing gradient-free pruning methods. It is even competitive with methods that include significant fine-tuning. We demonstrate our method on both encoder and decoder architectures, including BERT, DistilBERT, and Llama-2, using benchmarks such as GLUE, SQuAD, and WikiText-2.
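As a rough illustration of the "prune, then correct the next layer" idea, the sketch below prunes hidden neurons of a single MLP block and computes corrected next-layer weights by least squares on a small calibration set. This is not STAT itself: the paper also compresses attention heads and uses a series of structured matrix factorizations rather than this plain least-squares fit; the activation-norm importance score and the function name `prune_mlp_with_correction` are assumptions for the example.

```python
import numpy as np

def prune_mlp_with_correction(W1, b1, W2, X_calib, keep_ratio=0.5):
    """Illustrative structured pruning of an MLP block Y = relu(X @ W1 + b1) @ W2.

    Drops the hidden neurons with the smallest activation norm on calibration
    data, then corrects the next layer's weights by least squares so the
    block's output on the calibration set is approximately preserved.
    """
    H = np.maximum(X_calib @ W1 + b1, 0.0)      # hidden activations, shape (n, h)
    norms = np.linalg.norm(H, axis=0)           # simple per-neuron importance
    k = max(1, int(keep_ratio * H.shape[1]))
    keep = np.argsort(norms)[-k:]               # indices of neurons to keep

    target = H @ W2                             # original block output
    # Corrected next-layer weights: fit the kept activations to the original output.
    W2_corrected, *_ = np.linalg.lstsq(H[:, keep], target, rcond=None)

    return W1[:, keep], b1[keep], W2_corrected, keep

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, h, n = 32, 128, 256
    W1, b1, W2 = rng.normal(size=(d, h)), rng.normal(size=h), rng.normal(size=(h, d))
    X = rng.normal(size=(n, d))

    W1p, b1p, W2p, keep = prune_mlp_with_correction(W1, b1, W2, X, keep_ratio=0.5)
    original = np.maximum(X @ W1 + b1, 0) @ W2
    pruned = np.maximum(X @ W1p + b1p, 0) @ W2p
    print("relative error:", np.linalg.norm(original - pruned) / np.linalg.norm(original))
```

The key design point this mirrors is that the correction needs only forward activations on a few hundred calibration examples, never gradients, which is why such pruning can run in minutes without fine-tuning.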
Abstract: Sketch drawings are an intuitive visual domain that generally preserves semantics. Previous work has shown that recurrent neural networks are capable of producing sketch drawings of one or a few classes at a time. In this work, we focus on the representations developed by training a generative model to produce sketches from pixel images across many classes in a sketch domain. We find that the embeddings learned by this sketching model are extremely informative for visual tasks and capture compositional information. We then use them to exceed state-of-the-art performance in unsupervised few-shot classification on the Omniglot and mini-ImageNet benchmarks. We also leverage the generative capacity of our model to produce high-quality sketches of novel classes based on just a single example.
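The few-shot use of such learned embeddings can be pictured with a standard prototype-style classifier, sketched below under the assumption of a frozen encoder `embed` standing in for the sketch model's encoder; the helper names and the cosine-similarity protocol are illustrative, not the paper's exact evaluation setup.

```python
import numpy as np

def nearest_centroid_few_shot(embed, support_images, support_labels, query_images):
    """Few-shot classification with a frozen embedding model.

    `embed` maps a batch of images to embedding vectors. Class prototypes are
    the mean support embedding per class, and each query is assigned to the
    nearest prototype by cosine similarity.
    """
    def normalize(Z):
        return Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-8)

    S = normalize(embed(support_images))
    Q = normalize(embed(query_images))

    classes = np.unique(support_labels)
    prototypes = np.stack([S[support_labels == c].mean(axis=0) for c in classes])
    prototypes = normalize(prototypes)

    scores = Q @ prototypes.T            # cosine similarity to each class prototype
    return classes[np.argmax(scores, axis=1)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_embed = lambda imgs: imgs.reshape(len(imgs), -1)   # placeholder encoder
    support = rng.normal(size=(5 * 5, 28, 28))              # a 5-way, 5-shot episode
    labels = np.repeat(np.arange(5), 5)
    queries = support[::5] + 0.01 * rng.normal(size=(5, 28, 28))
    print(nearest_centroid_few_shot(fake_embed, support, labels, queries))
```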
Abstract: Attribute prior avoidance entails subconscious or willful non-modeling of (meta)attributes that datasets are oft born with, such as the 40 semantic facial attributes associated with the CelebA and CelebA-HQ datasets. The consequences of this infirmity, we discover, are especially stark in state-of-the-art deep generative models learned on these datasets that model only the pixel-space measurements, resulting in an inter-attribute bias-laden latent space. This viscerally manifests itself when we perform face manipulation experiments based on latent vector interpolations. In this paper, we address this problem and propose a post-hoc solution that utilizes an Ising attribute prior learned in the attribute space, and showcase its efficacy via qualitative experiments.
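For readers unfamiliar with Ising priors, the sketch below shows the general shape of such a prior over binary attribute vectors: per-attribute fields and pairwise couplings define an energy, and lower energy marks attribute combinations that co-occur in the data. The crude moment-based fit and the helper names here are assumptions for illustration, not the estimation procedure used in the paper.

```python
import numpy as np

def fit_ising_prior(A):
    """Fit a crude Ising prior over binary attributes A in {-1, +1}^(n, d).

    Fields come from attribute means and couplings from empirical pairwise
    correlations (a mean-field-style estimate, not a full maximum-likelihood
    fit). Returns (h, J) defining the energy used in `ising_energy`.
    """
    h = np.arctanh(np.clip(A.mean(axis=0), -0.999, 0.999))
    J = (A.T @ A) / len(A)
    np.fill_diagonal(J, 0.0)
    return h, J

def ising_energy(a, h, J):
    """Lower energy = more plausible attribute configuration under the prior."""
    return -h @ a - 0.5 * a @ J @ a

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "dataset": attributes 0 and 1 usually agree, mimicking correlated facial attributes.
    base = rng.choice([-1, 1], size=(1000, 1))
    A = np.hstack([base,
                   base * rng.choice([1, 1, 1, -1], size=(1000, 1)),
                   rng.choice([-1, 1], size=(1000, 3))])
    h, J = fit_ising_prior(A)

    consistent = np.array([1, 1, 1, -1, 1])
    inconsistent = np.array([1, -1, 1, -1, 1])
    print(ising_energy(consistent, h, J), "<", ising_energy(inconsistent, h, J))
```

A post-hoc use of such a prior would score the attribute configuration of each interpolated sample and flag or correct low-probability (high-energy) combinations, rather than retraining the generative model.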