Oliver Hausdörfer

Communication Compression for Tensor Parallel LLM Inference

Nov 14, 2024

Jan Hansen-Palmus, Michael Truong-Le, Oliver Hausdörfer, Alok Verma

Abstract: Large Language Models (LLMs) have pushed the frontier of artificial intelligence but comprise hundreds of billions of parameters and operations. To reduce inference latency, LLMs are deployed on multiple hardware accelerators through various Model Parallelism strategies. Our paper examines one such strategy, Tensor Parallel, in detail and proposes to reduce latency by compressing inter-accelerator communication. We leverage fine-grained quantization techniques to compress selected activations by 3.5-4.5x. Our proposed method leads to up to a 2x reduction in time-to-first-token (TTFT) with negligible degradation in model performance.
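
The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of what fine-grained (block-wise) quantization of an activation vector could look like before it is sent between accelerators. The function names, the block size of 32, the 4-bit symmetric integer format, and the 16-bit source and scale widths are all illustrative assumptions, not the paper's configuration.

```python
import random

def quantize_blockwise(values, block_size=32, bits=4):
    """Symmetric per-block quantization: each block shares one float scale.

    Hypothetical helper for illustration; block_size and bits are assumptions.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit signed integers
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        # Avoid division by zero for an all-zero block.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer code and clamp to the signed range.
        codes.extend(max(-qmax, min(qmax, round(v / scale))) for v in block)
    return codes, scales

def dequantize_blockwise(codes, scales, block_size=32):
    """Recover approximate values on the receiving side."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]

def compression_ratio(n, block_size=32, bits=4, source_bits=16, scale_bits=16):
    """Bits before compression divided by bits after (codes plus scales)."""
    n_blocks = -(-n // block_size)  # ceiling division
    return (n * source_bits) / (n * bits + n_blocks * scale_bits)

# Usage: quantize a random activation vector and inspect the round-trip error.
random.seed(0)
activations = [random.gauss(0.0, 1.0) for _ in range(4096)]
codes, scales = quantize_blockwise(activations)
restored = dequantize_blockwise(codes, scales)
max_error = max(abs(a - r) for a, r in zip(activations, restored))
print(f"ratio: {compression_ratio(len(activations)):.2f}x, max error: {max_error:.4f}")
```

Under these assumed settings, each value costs 4 bits plus a 16-bit scale shared across 32 values, so 4096 16-bit activations compress by about 3.56x. Smaller blocks track outliers more closely but spend more bits on scales, which is the basic trade-off in any block-wise scheme.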

Latent Action Priors From a Single Gait Cycle Demonstration for Online Imitation Learning

Oct 04, 2024

Oliver Hausdörfer, Alexander von Rohr, Éric Lefort, Angela Schoellig

Abstract: Deep Reinforcement Learning (DRL) in simulation often results in brittle and unrealistic learning outcomes. To push the agent towards more desirable solutions, prior information can be injected into the learning process through, for instance, reward shaping, expert data, or motion primitives. We propose an additional inductive bias for robot learning: latent actions learned from expert demonstration as priors in the action space. We show that these action priors can be learned from only a single open-loop gait cycle using a simple autoencoder. Combining these latent action priors with established style rewards for imitation in DRL achieves performance above the level of the expert demonstration and leads to more desirable gaits. Further, action priors substantially improve performance on transfer tasks, even leading to gait transitions at higher target speeds. Videos and code are available at https://sites.google.com/view/latent-action-priors.

* Submitted to ICRA 2025
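
The abstract says only that the action priors come from "a simple autoencoder" trained on one gait cycle. As a rough illustration of that idea, the sketch below fits a linear autoencoder to a synthetic cycle of phase-shifted sinusoids, one per joint. The synthetic data, the linear architecture, the latent size of 2, and every function name here are assumptions made for illustration; the authors' actual model and data are in the linked code.

```python
import math
import random

def gait_cycle(n_joints=8, n_steps=50):
    """Synthetic open-loop gait cycle: one phase-shifted sinusoid per joint."""
    return [[math.sin(2 * math.pi * (t / n_steps + j / n_joints))
             for j in range(n_joints)] for t in range(n_steps)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def reconstruct(enc, dec, x):
    z = matvec(enc, x)        # latent action
    return z, matvec(dec, z)  # decoded full-dimensional action

def mse(enc, dec, data):
    """Mean squared reconstruction error per time step."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(enc, dec, x)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

def train(data, latent_dim=2, lr=0.05, epochs=400, seed=0):
    """Full-batch gradient descent on the squared reconstruction error."""
    rng = random.Random(seed)
    d, n = len(data[0]), len(data)
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(latent_dim)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(latent_dim)] for _ in range(d)]
    for _ in range(epochs):
        g_enc = [[0.0] * d for _ in range(latent_dim)]
        g_dec = [[0.0] * latent_dim for _ in range(d)]
        for x in data:
            z, x_hat = reconstruct(enc, dec, x)
            err = [a - b for a, b in zip(x_hat, x)]
            # Error propagated back through the decoder (dec^T err).
            back = [sum(dec[i][k] * err[i] for i in range(d))
                    for k in range(latent_dim)]
            for i in range(d):
                for k in range(latent_dim):
                    g_dec[i][k] += err[i] * z[k]
            for k in range(latent_dim):
                for j in range(d):
                    g_enc[k][j] += back[k] * x[j]
        # Apply the averaged gradients after the full pass over the cycle.
        for i in range(d):
            for k in range(latent_dim):
                dec[i][k] -= lr * g_dec[i][k] / n
        for k in range(latent_dim):
            for j in range(d):
                enc[k][j] -= lr * g_enc[k][j] / n
    return enc, dec

data = gait_cycle()
enc, dec = train(data)
print(f"reconstruction error after training: {mse(enc, dec, data):.6f}")
```

In a setup like the one the abstract describes, a policy could output the low-dimensional latent and pass it through the decoder to obtain joint-level actions; how the paper wires this into DRL training is not stated in the abstract and is not shown here.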
