Huazheng Wang

Eugene

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning

Oct 31, 2024
Via arXiv

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Oct 18, 2024
Via arXiv

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Oct 17, 2024
Via arXiv

Conversational Dueling Bandits in Generalized Linear Models

Jul 26, 2024
Via arXiv

Contractual Reinforcement Learning: Pulling Arms with Invisible Hands

Jul 02, 2024
Via arXiv

LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking

May 31, 2024
Via arXiv

FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning

May 30, 2024
Via arXiv

Hard Work Does Not Always Pay Off: Poisoning Attacks on Neural Architecture Search

May 09, 2024
Via arXiv

Embodied LLM Agents Learn to Cooperate in Organized Teams

Mar 19, 2024
Via arXiv

AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks

Mar 02, 2024
Via arXiv