Yaodong Yang

Model Evolution Framework with Genetic Algorithm for Multi-Task Reinforcement Learning

Feb 19, 2025
Via arXiv

Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer

Feb 04, 2025
Via arXiv

RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?

Jan 20, 2025
Via arXiv

Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction

Jan 09, 2025
Via arXiv

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Dec 24, 2024
Via arXiv

Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback

Dec 20, 2024
Via arXiv

Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation

Dec 15, 2024
Via arXiv

RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors

Dec 14, 2024
Via arXiv

Random Feature Models with Learnable Activation Functions

Nov 29, 2024
Via arXiv

Object-Centric Dexterous Manipulation from Human Motion Data

Nov 06, 2024
Via arXiv