Junge Zhang

EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Feb 18, 2025

IDEA-Bench: How Far are Generative Models from Professional Designing?

Dec 16, 2024

Rethinking Generalizability and Discriminability of Self-Supervised Learning from Evolutionary Game Theory Perspective

Nov 30, 2024

Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown

Oct 01, 2024

Recent Advances in Attack and Defense Approaches of Large Language Models

Sep 05, 2024

Position: Foundation Agents as the Paradigm Shift for Decision Making

May 29, 2024

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

May 21, 2024

Urban Scene Diffusion through Semantic Occupancy Map

Mar 19, 2024

S-NeRF++: Autonomous Driving Simulation via Neural Reconstruction and Generation

Feb 03, 2024

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

Jan 15, 2024