Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pei Xu

WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models

Jun 12, 2025

Qiyue Yin, Pei Xu, Qiaozhe Li, Shengda Liu, Shengqi Shen, Tong Wang, Yihong Han, Xiaonan Zhao, Likun Yang, Shiyue Cao(+8 more)

Abstract:Recent breakthroughs in Large Language Models (LLMs) have led to a qualitative leap in artificial intelligence' s performance on reasoning tasks, particularly demonstrating remarkable capabilities in mathematical, symbolic, and commonsense reasoning. However, as a critical component of advanced human cognition, strategic reasoning, i.e., the ability to assess multi-agent behaviors in dynamic environments, formulate action plans, and adapt strategies, has yet to be systematically evaluated or modeled. To address this gap, this paper introduces WGSR-Bench, the first strategy reasoning benchmark for LLMs using wargame as its evaluation environment. Wargame, a quintessential high-complexity strategic scenario, integrates environmental uncertainty, adversarial dynamics, and non-unique strategic choices, making it an effective testbed for assessing LLMs' capabilities in multi-agent decision-making, intent inference, and counterfactual reasoning. WGSR-Bench designs test samples around three core tasks, i.e., Environmental situation awareness, Opponent risk modeling and Policy generation, which serve as the core S-POE architecture, to systematically assess main abilities of strategic reasoning. Finally, an LLM-based wargame agent is designed to integrate these parts for a comprehensive strategy reasoning assessment. With WGSR-Bench, we hope to assess the strengths and limitations of state-of-the-art LLMs in game-theoretic strategic reasoning and to advance research in large model-driven strategic intelligence.

* 15 pages, 17 figures

Via

Access Paper or Ask Questions

FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance

Oct 08, 2024

Ruocheng Wang, Pei Xu, Haochen Shi, Elizabeth Schumann, C. Karen Liu

Abstract:Piano playing requires agile, precise, and coordinated hand control that stretches the limits of dexterity. Hand motion models with the sophistication to accurately recreate piano playing have a wide range of applications in character animation, embodied AI, biomechanics, and VR/AR. In this paper, we construct a first-of-its-kind large-scale dataset that contains approximately 10 hours of 3D hand motion and audio from 15 elite-level pianists playing 153 pieces of classical music. To capture natural performances, we designed a markerless setup in which motions are reconstructed from multi-view videos using state-of-the-art pose estimation models. The motion data is further refined via inverse kinematics using the high-resolution MIDI key-pressing data obtained from sensors in a specialized Yamaha Disklavier piano. Leveraging the collected dataset, we developed a pipeline that can synthesize physically-plausible hand motions for musical scores outside of the dataset. Our approach employs a combination of imitation learning and reinforcement learning to obtain policies for physics-based bimanual control involving the interaction between hands and piano keys. To solve the sampling efficiency problem with the large motion dataset, we use a diffusion model to generate natural reference motions, which provide high-level trajectory and fingering (finger order and placement) information. However, the generated reference motion alone does not provide sufficient accuracy for piano performance modeling. We then further augmented the data by using musical similarity to retrieve similar motions from the captured dataset to boost the precision of the RL policy. With the proposed method, our model generates natural, dexterous motions that generalize to music from outside the training dataset.

* SIGGRAPH Asia 2024. Project page: https://for-elise.github.io/

Via

Access Paper or Ask Questions

AdaptNet: Policy Adaptation for Physics-Based Character Control

Oct 09, 2023

Pei Xu, Kaixiang Xie, Sheldon Andrews, Paul G. Kry, Michael Neff, Morgan McGuire, Ioannis Karamouzas, Victor Zordan

Figure 1 for AdaptNet: Policy Adaptation for Physics-Based Character Control

Figure 2 for AdaptNet: Policy Adaptation for Physics-Based Character Control

Figure 3 for AdaptNet: Policy Adaptation for Physics-Based Character Control

Figure 4 for AdaptNet: Policy Adaptation for Physics-Based Character Control

Abstract:Motivated by humans' ability to adapt skills in the learning of new ones, this paper presents AdaptNet, an approach for modifying the latent space of existing policies to allow new behaviors to be quickly learned from like tasks in comparison to learning from scratch. Building on top of a given reinforcement learning controller, AdaptNet uses a two-tier hierarchy that augments the original state embedding to support modest changes in a behavior and further modifies the policy network layers to make more substantive changes. The technique is shown to be effective for adapting existing physics-based controllers to a wide range of new styles for locomotion, new task targets, changes in character morphology and extensive changes in environment. Furthermore, it exhibits significant increase in learning efficiency, as indicated by greatly reduced training times when compared to training from scratch or using other approaches that modify existing policies. Code is available at https://motion-lab.github.io/AdaptNet.

* ACM Transactions on Graphics 42, 6, Article 112.1522 (December 2023)
* SIGGRAPH Asia 2023. Video: https://youtu.be/WxmJSCNFb28. Website: https://motion-lab.github.io/AdaptNet, https://pei-xu.github.io/AdaptNet

Via

Access Paper or Ask Questions

Composite Motion Learning with Task Control

May 05, 2023

Pei Xu, Xiumin Shang, Victor Zordan, Ioannis Karamouzas

Figure 1 for Composite Motion Learning with Task Control

Figure 2 for Composite Motion Learning with Task Control

Figure 3 for Composite Motion Learning with Task Control

Figure 4 for Composite Motion Learning with Task Control

Abstract:We present a deep learning method for composite and task-driven motion control for physically simulated characters. In contrast to existing data-driven approaches using reinforcement learning that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly by leveraging the use of multiple discriminators in a GAN-like setup. In this process, there is no need of any manual work to produce composite reference motions for learning. Instead, the control policy explores by itself how the composite motions can be combined automatically. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies in an incremental manner, where we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta one for new composite tasks. We show the applicability of our approach on a variety of challenging multi-objective tasks involving both composite motion imitation and multiple goal-directed control.

* ACM Transactions on Graphics (August 2023)
* SIGGRAPH 2023. Code: https://github.com/xupei0610/CompositeMotion. Video: https://youtu.be/mcRAxwoTh3E

Via

Access Paper or Ask Questions

Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction

Feb 21, 2023

Pei Xu, Jean-Bernard Hayet, Ioannis Karamouzas

Abstract:Real-time, accurate prediction of human steering behaviors has wide applications, from developing intelligent traffic systems to deploying autonomous driving systems in both real and simulated worlds. In this paper, we present ContextVAE, a context-aware approach for multi-modal vehicle trajectory prediction. Built upon the backbone architecture of a timewise variational autoencoder, ContextVAE employs a dual attention mechanism for observation encoding that accounts for the environmental context information and the dynamic agents' states in a unified way. By utilizing features extracted from semantic maps during agent state encoding, our approach takes into account both the social features exhibited by agents on the scene and the physical environment constraints to generate map-compliant and socially-aware trajectories. We perform extensive testing on the nuScenes prediction challenge, Lyft Level 5 dataset and Waymo Open Motion Dataset to show the effectiveness of our approach and its state-of-the-art performance. In all tested datasets, ContextVAE models are fast to train and provide high-quality multi-modal predictions in real-time.

Via

Access Paper or Ask Questions

SocialVAE: Human Trajectory Prediction using Timewise Latents

Mar 29, 2022

Pei Xu, Jean-Bernard Hayet, Ioannis Karamouzas

Figure 1 for SocialVAE: Human Trajectory Prediction using Timewise Latents

Figure 2 for SocialVAE: Human Trajectory Prediction using Timewise Latents

Figure 3 for SocialVAE: Human Trajectory Prediction using Timewise Latents

Figure 4 for SocialVAE: Human Trajectory Prediction using Timewise Latents

Abstract:Predicting pedestrian movement is critical for human behavior analysis and also for safe and efficient human-agent interactions. However, despite significant advancements, it is still challenging for existing approaches to capture the uncertainty and multimodality of human navigation decision making. In this paper, we propose SocialVAE, a novel approach for human trajectory prediction. The core of SocialVAE is a timewise variational autoencoder architecture that exploits stochastic recurrent neural networks to perform prediction, combined with a social attention mechanism and backward posterior approximation to allow for better extraction of pedestrian navigation strategies. We show that SocialVAE improves current state-of-the-art performance on several pedestrian trajectory prediction benchmarks, including the ETH/UCY benchmark, the Stanford Drone Dataset and SportVU NBA movement dataset. Code is available at: https://github.com/xupei0610/SocialVAE.

Via

Access Paper or Ask Questions

A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

May 21, 2021

Pei Xu, Ioannis Karamouzas

Figure 1 for A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

Figure 2 for A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

Figure 3 for A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

Figure 4 for A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Character Control

Abstract:We present a simple and intuitive approach for interactive control of physically simulated characters. Our work builds upon generative adversarial networks (GAN) and reinforcement learning, and introduces an imitation learning framework where an ensemble of classifiers and an imitation policy are trained in tandem given pre-processed reference clips. The classifiers are trained to discriminate the reference motion from the motion generated by the imitation policy, while the policy is rewarded for fooling the discriminators. Using our GAN-based approach, multiple motor control policies can be trained separately to imitate different behaviors. In runtime, our system can respond to external control signal provided by the user and interactively switch between different policies. Compared to existing methods, our proposed approach has the following attractive properties: 1) achieves state-of-the-art imitation performance without manually designing and fine tuning a reward function; 2) directly controls the character without having to track any target reference pose explicitly or implicitly through a phase state; and 3) supports interactive policy switching without requiring any motion generation or motion matching mechanism. We highlight the applicability of our approach in a range of imitation and interactive control tasks, while also demonstrating its ability to withstand external perturbations as well as to recover balance. Overall, our approach generates high-fidelity motion, has low runtime cost, and can be easily integrated into interactive applications and games.

Via

Access Paper or Ask Questions

Human-Inspired Multi-Agent Navigation using Knowledge Distillation

Mar 20, 2021

Pei Xu, Ioannis Karamouzas

Figure 1 for Human-Inspired Multi-Agent Navigation using Knowledge Distillation

Figure 2 for Human-Inspired Multi-Agent Navigation using Knowledge Distillation

Figure 3 for Human-Inspired Multi-Agent Navigation using Knowledge Distillation

Figure 4 for Human-Inspired Multi-Agent Navigation using Knowledge Distillation

Abstract:Despite significant advancements in the field of multi-agent navigation, agents still lack the sophistication and intelligence that humans exhibit in multi-agent settings. In this paper, we propose a framework for learning a human-like general collision avoidance policy for agent-agent interactions in fully decentralized, multi-agent environments. Our approach uses knowledge distillation with reinforcement learning to shape the reward function based on expert policies extracted from human trajectory demonstrations through behavior cloning. We show that agents trained with our approach can take human-like trajectories in collision avoidance and goal-directed steering tasks not provided by the demonstrations, outperforming the experts as well as learning-based agents trained without knowledge distillation.

Via

Access Paper or Ask Questions

Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Mar 16, 2020

Pei Xu, Ioannis Karamouzas

Figure 1 for Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Figure 2 for Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Figure 3 for Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Figure 4 for Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Abstract:Learning controls in high-dimensional continuous action spaces, such as controlling the movements of highly articulated agents and robots, has long been a standing challenge to model-free deep reinforcement learning (DRL). In this paper we propose a general, yet simple, framework for improving the action exploration of policy gradient DRL algorithms. Our approach adapts ideas from the particle filtering literature to dynamically discretize the continuous action space and track policies represented as a mixture of Gaussians. We demonstrate the applicability of our approach on state-of-the-art DRL baselines in challenging high-dimensional motor tasks involving articulated agents. We show that our adaptive particle-based discretization leads to improved final performance and speed of convergence as compared to uniform discretization schemes and to corresponding implementations in continuous action spaces, highlighting the importance of exploration. In addition, the resulting policies are more stable, exhibiting less variance across different training trials.

* 21 pages, 9 figures

Via

Access Paper or Ask Questions

Robot Navigation with Map-Based Deep Reinforcement Learning

Feb 11, 2020

Guangda Chen, Lifan Pan, Yu'an Chen, Pei Xu, Zhiqiang Wang, Peichen Wu, Jianmin Ji, Xiaoping Chen

Figure 1 for Robot Navigation with Map-Based Deep Reinforcement Learning

Figure 2 for Robot Navigation with Map-Based Deep Reinforcement Learning

Figure 3 for Robot Navigation with Map-Based Deep Reinforcement Learning

Figure 4 for Robot Navigation with Map-Based Deep Reinforcement Learning

Abstract:This paper proposes an end-to-end deep reinforcement learning approach for mobile robot navigation with dynamic obstacles avoidance. Using experience collected in a simulation environment, a convolutional neural network (CNN) is trained to predict proper steering actions of a robot from its egocentric local occupancy maps, which accommodate various sensors and fusion algorithms. The trained neural network is then transferred and executed on a real-world mobile robot to guide its local path planning. The new approach is evaluated both qualitatively and quantitatively in simulation and real-world robot experiments. The results show that the map-based end-to-end navigation model is easy to be deployed to a robotic platform, robust to sensor noise and outperforms other existing DRL-based models in many indicators.

* 6 pages, 7 figures, accepted by ICNSC 2020

Via

Access Paper or Ask Questions