Nuoya Xiong

Cooperative Multi-agent RL with Communication Constraints

Jan 18, 2026

Token-Level LLM Collaboration via FusionRoute

Jan 08, 2026

Projection Optimization: A General Framework for Multi-Objective and Multi-Group RLHF

Feb 24, 2025

A Correction of Pseudo Log-Likelihood Method

Mar 26, 2024

Sample-Efficient Multi-Agent RL: An Optimization Perspective

Oct 10, 2023

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization

Oct 09, 2023

A General Framework for Sequential Decision-Making under Adaptivity Constraints

Jun 27, 2023

Provably Safe Reinforcement Learning with Step-wise Violation Constraints

Feb 13, 2023

Combinatorial Causal Bandits without Graph Skeleton

Jan 31, 2023

Pure Exploration of Causal Bandits

Jun 16, 2022