Ruoxi Jia

Virginia Tech

Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice

Dec 30, 2025, via arXiv

More Than the Final Answer: Improving Visual Extraction and Logical Consistency in Vision-Language Models

Dec 13, 2025, via arXiv

Quagmires in SFT-RL Post-Training: When High SFT Scores Mislead and What to Use Instead

Oct 02, 2025, via arXiv

Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls

Oct 02, 2025, via arXiv

MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling

Aug 11, 2025, via arXiv

CONCORD: Concept-Informed Diffusion for Dataset Distillation

May 23, 2025, via arXiv

LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models

Apr 14, 2025, via arXiv

Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning

Apr 02, 2025, via arXiv

Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models

Mar 12, 2025, via arXiv

Optimizing Product Provenance Verification using Data Valuation Methods

Feb 21, 2025, via arXiv