Wei Xiong

From Lists to Emojis: How Format Bias Affects Model Alignment

Sep 18, 2024

Semantics Preserving Emoji Recommendation with Large Language Models

Sep 16, 2024

GroundingBooth: Grounding Text-to-Image Customization

Sep 13, 2024

Building Math Agents with Multi-Turn Iterative Preference Learning

Sep 04, 2024

WAS: Dataset and Methods for Artistic Text Segmentation

Jul 31, 2024

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

Jun 18, 2024

BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics

May 27, 2024

RLHF Workflow: From Reward Modeling to Online RLHF

May 13, 2024

DPO Meets PPO: Reinforced Token Optimization for RLHF

Apr 29, 2024

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

Apr 08, 2024