Yi Dong

TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models (Mar 13, 2025)

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts (Mar 09, 2025)

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks (Mar 06, 2025)

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training (Feb 06, 2025)

Position: Towards a Responsible LLM-empowered Multi-Agent Systems (Feb 03, 2025)

FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model (Feb 03, 2025)

MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs (Dec 19, 2024)

Diverging Preferences: When do Annotators Disagree and do Models Know? (Oct 18, 2024)

HelpSteer2-Preference: Complementing Ratings with Preferences (Oct 02, 2024)

Adaptive Guardrails For Large Language Models via Trust Modeling and In-Context Learning (Aug 16, 2024)