Abstract: Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strengths of the Online Decision Transformer. Our methodology enables parallel training where Dreamer-produced trajectories enhance the contextual decision-making of the transformer, creating a bidirectional enhancement loop. We empirically demonstrate the efficacy of our approach on a suite of challenging benchmarks, achieving notable improvements in sample efficiency and reward maximization over existing methods. Our results indicate that the proposed integrated framework not only accelerates learning but also showcases robustness in diverse and dynamic scenarios, marking a significant step forward in model-based reinforcement learning.
Abstract: In tasks aiming for long-term returns, planning becomes necessary. We study generative modeling for planning with datasets repurposed from offline reinforcement learning. Specifically, we identify temporal consistency in the absence of step-wise rewards as one key technical challenge. We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent space to connect a Transformer-based trajectory generator and the final return. LPT can be learned with maximum likelihood estimation on trajectory-return pairs. In learning, posterior sampling of the latent variable naturally gathers sub-trajectories to form a consistent abstraction despite the finite context. At test time, the latent variable is inferred from an expected return before policy execution, realizing the idea of planning as inference. It then guides the autoregressive policy throughout the episode, functioning as a plan. Our experiments demonstrate that LPT can discover improved decisions from suboptimal trajectories. It achieves competitive performance across several benchmarks, including Gym-Mujoco, Maze2D, and Connect Four, exhibiting capabilities of nuanced credit assignment, trajectory stitching, and adaptation to environmental contingencies. These results validate that latent variable inference can be a strong alternative to step-wise reward prompting.
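The test-time flow described in this abstract (infer the latent plan from an expected return, then let it guide the policy for the whole episode) can be sketched with toy placeholder components. This is only an illustration of the control flow: `sample_latent`, `policy`, and the toy dynamics are hypothetical stand-ins, not LPT's actual learned networks.

```python
import numpy as np

# Illustrative stand-ins for LPT's learned components (hypothetical names,
# not from the paper): a return-conditioned sampler for the latent plan z,
# and an autoregressive policy guided by z.
def sample_latent(target_return, rng, dim=8):
    # A real model would infer z from p(z | return) learned via MLE;
    # here we draw from a return-shifted Gaussian as a placeholder.
    return rng.normal(loc=target_return * 0.01, scale=1.0, size=dim)

def policy(state, z):
    # Placeholder policy: the action depends on both the state and the plan z.
    return np.tanh(z[: state.shape[0]] + 0.1 * state)

# Planning as inference: infer z once from the desired return, then keep it
# fixed as a plan that guides action selection throughout the episode.
rng = np.random.default_rng(0)
z = sample_latent(target_return=300.0, rng=rng)
state = np.zeros(4)
for _ in range(5):
    action = policy(state, z)
    state = state + 0.1 * action  # toy dynamics
```

The key design choice this mirrors is that `z` is sampled once per episode rather than re-prompted at every step, in contrast to step-wise return prompting.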
Abstract: In this paper, we introduce the Skill-Driven Skill Recombination Algorithm (SDSRA), a novel framework that significantly improves the efficiency of maximum-entropy reinforcement learning. We find that SDSRA converges faster than the traditional Soft Actor-Critic (SAC) algorithm and produces improved policies. By integrating skill-based strategies within the robust Actor-Critic framework, SDSRA demonstrates remarkable adaptability and performance across a wide array of complex and diverse benchmarks.
Abstract: Given the complex geometry of white matter streamlines, Autoencoders have been proposed as a dimension-reduction tool to simplify the analysis of streamlines in a low-dimensional latent space. However, despite these recent successes, most encoder architectures perform dimension reduction on single streamlines rather than on full bundles of streamlines. This is a severe limitation that disregards the global geometric structure of the bundle in favor of individual fibers. Moreover, the latent space may not be well structured, which casts doubt on its interpretability. In this paper we propose a novel Differentiable Vector Quantized Variational Autoencoder, which is engineered to ingest an entire bundle of streamlines as a single data point and provides reliable, trustworthy encodings that can later be used to analyze streamlines in the latent space. Comparisons with several state-of-the-art Autoencoders demonstrate superior performance in both encoding and synthesis.
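At the core of any VQ-VAE, including the differentiable variant proposed above, is a quantization bottleneck that snaps encoder outputs to their nearest codebook vectors. A minimal sketch of that standard step is shown below; the paper's differentiable quantization and bundle-level ingestion are not reproduced here.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Nearest-neighbor codebook lookup, the core of a standard VQ-VAE
    bottleneck.

    z: (n, d) array of encoder outputs.
    codebook: (k, d) array of learned code vectors.
    Returns the quantized vectors and their codebook indices.
    """
    # Squared Euclidean distance between every encoding and every code.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx

# Toy usage: two encodings snap to the closer of two codes.
codes, idx = vector_quantize(
    np.array([[0.1, 0.1], [0.9, 1.2]]),
    np.array([[0.0, 0.0], [1.0, 1.0]]),
)
```

In a full VQ-VAE the argmin is non-differentiable, which is typically handled with a straight-through gradient estimator; making this step differentiable is precisely the kind of issue the proposed model addresses.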
Abstract: We present StreamNet, an autoencoder architecture for analyzing the highly heterogeneous geometry of large collections of white matter streamlines. The proposed framework takes advantage of the geometry-preserving properties of the Wasserstein-1 metric to achieve direct encoding and reconstruction of entire bundles of streamlines. We show that the model not only accurately captures the distributional structure of streamlines in the population, but also achieves superior reconstruction performance between real and synthetic streamlines. Model performance is evaluated on white matter streamlines resulting from T1-weighted diffusion imaging of 40 healthy controls, using a recent state-of-the-art bundle-comparison metric that measures fiber-shape similarity.
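The Wasserstein-1 metric invoked above compares whole distributions rather than matched pairs of points, which is what makes bundle-level encoding possible. As a minimal, hedged sketch (not StreamNet's actual loss), the sliced approximation below averages closed-form 1-D Wasserstein-1 distances over random projection directions to compare two point clouds, such as points sampled from two streamline bundles.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def sliced_w1(X, Y, n_proj=50, seed=0):
    """Sliced Wasserstein-1 distance between two point clouds.

    X, Y: (n, d) and (m, d) arrays of points (e.g. streamline points).
    Projects both clouds onto random unit directions and averages the
    exact 1-D Wasserstein-1 distance over those projections.
    """
    rng = np.random.default_rng(seed)
    dists = []
    for _ in range(n_proj):
        v = rng.normal(size=X.shape[1])
        v /= np.linalg.norm(v)
        dists.append(wasserstein_distance(X @ v, Y @ v))
    return float(np.mean(dists))
```

Because it operates on empirical distributions, the distance is zero for identical clouds and grows with rigid displacement, regardless of how the individual points are ordered, which is the property that suits unordered streamline bundles.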
Abstract: We present a geometric framework for aligning white matter fiber tracts. By registering fiber tracts between brains, one expects to see overlap of anatomical structures, enabling meaningful comparisons across subjects. However, the geometry of white matter tracts is highly heterogeneous, and finding direct tract correspondence across multiple individuals remains a challenging problem. We present a novel deformation metric between tracts that allows one to compare tracts while simultaneously obtaining a registration. To accomplish this, each fiber tract is represented by an intrinsic mean together with a deformation field encoded as tangent vectors from that mean. In this setting, one can compute a parallel transport between tracts and then register the corresponding tangent vectors. We present the results of bundle alignment on a population of 43 healthy adult subjects.
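The "intrinsic mean plus tangent vectors" representation above is a standard Riemannian construction. As a minimal sketch on the simplest curved space, the unit sphere (a stand-in for the paper's actual tract shape space), the code below computes a Karcher (intrinsic) mean by iteratively averaging log-mapped tangent vectors; the log and exp maps are exactly the operations that produce and consume the tangent vectors mentioned in the abstract.

```python
import numpy as np

def log_map(mu, x):
    """Log map on the unit sphere: tangent vector at mu pointing toward x."""
    d = x - np.dot(mu, x) * mu            # component of x orthogonal to mu
    nd = np.linalg.norm(d)
    if nd < 1e-12:
        return np.zeros_like(mu)
    theta = np.arccos(np.clip(np.dot(mu, x), -1.0, 1.0))
    return theta * d / nd

def exp_map(mu, v):
    """Exp map on the unit sphere: follow tangent vector v from mu."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return mu
    return np.cos(nv) * mu + np.sin(nv) * v / nv

def intrinsic_mean(points, iters=50):
    """Karcher mean: repeatedly average the log maps and step along them."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        v = np.mean([log_map(mu, p) for p in points], axis=0)
        mu = exp_map(mu, v)
    return mu
```

Once the mean is fixed, each data point is summarized by its tangent vector `log_map(mu, x)`, and comparing two tracts reduces to transporting tangent vectors between their means and registering them, as the abstract describes.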
Abstract: We present SrvfNet, a generative deep learning framework for the joint multiple alignment of large collections of functional data, represented as square-root velocity functions (SRVFs), to their templates. Our proposed framework is fully unsupervised and is capable of aligning to a predefined template as well as jointly predicting an optimal template from data while simultaneously achieving alignment. Our network is constructed as a generative encoder-decoder architecture of fully connected layers that produces a distribution over warping functions. We demonstrate the strength of our framework by validating it on synthetic data as well as diffusion profiles from magnetic resonance imaging (MRI) data.
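The SRVF representation named above has a simple closed form: for a function f, its SRVF is q(t) = f'(t) / sqrt(|f'(t)|), under which reparameterization acts by isometry, making the L2 distance between SRVFs a natural elastic distance. A minimal numerical sketch (not SrvfNet's network, just the input transform it assumes):

```python
import numpy as np

def srvf(f, t):
    """Square-root velocity function of a 1-D function sampled at points t:
    q(t) = f'(t) / sqrt(|f'(t)|)."""
    df = np.gradient(f, t)                     # finite-difference derivative
    return df / np.sqrt(np.abs(df) + 1e-12)    # small eps avoids divide-by-zero

# Usage: the identity function f(t) = t has f' = 1, so its SRVF is
# constant and approximately 1 everywhere.
t = np.linspace(0.0, 1.0, 101)
q = srvf(t, t)
```

Aligning functions then amounts to finding warping functions that minimize the L2 distance between their SRVFs, which is the optimization SrvfNet's decoder distribution over warping functions targets.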