Picture for Ziyi Yang

Ziyi Yang

Weighted-Reward Preference Optimization for Implicit Model Fusion

Add code
Dec 04, 2024
Viaarxiv icon

OASIS: Open Agent Social Interaction Simulations with One Million Agents

Add code
Nov 26, 2024
Figure 1 for OASIS: Open Agent Social Interaction Simulations with One Million Agents
Figure 2 for OASIS: Open Agent Social Interaction Simulations with One Million Agents
Figure 3 for OASIS: Open Agent Social Interaction Simulations with One Million Agents
Figure 4 for OASIS: Open Agent Social Interaction Simulations with One Million Agents
Viaarxiv icon

OASIS: Open Agents Social Interaction Simulations on One Million Agents

Add code
Nov 21, 2024
Figure 1 for OASIS: Open Agents Social Interaction Simulations on One Million Agents
Figure 2 for OASIS: Open Agents Social Interaction Simulations on One Million Agents
Figure 3 for OASIS: Open Agents Social Interaction Simulations on One Million Agents
Figure 4 for OASIS: Open Agents Social Interaction Simulations on One Million Agents
Viaarxiv icon

Towards Realistic Example-based Modeling via 3D Gaussian Stitching

Add code
Aug 28, 2024
Viaarxiv icon

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

Add code
Aug 16, 2024
Figure 1 for See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Figure 2 for See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Figure 3 for See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Figure 4 for See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Viaarxiv icon

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

Add code
Jul 18, 2024
Figure 1 for Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Figure 2 for Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Figure 3 for Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Figure 4 for Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Viaarxiv icon

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning

Add code
Jul 02, 2024
Viaarxiv icon

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Add code
May 29, 2024
Figure 1 for Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Figure 2 for Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Figure 3 for Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Figure 4 for Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Viaarxiv icon

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Add code
Apr 23, 2024
Figure 1 for Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Figure 2 for Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Figure 3 for Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Figure 4 for Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Viaarxiv icon

3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting

Add code
Mar 30, 2024
Viaarxiv icon