Hongning Wang

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Mar 26, 2025
Via arXiv

Intelligence Test

Feb 26, 2025
Via arXiv

LongSafety: Evaluating Long-Context Safety of Large Language Models

Feb 24, 2025
Via arXiv

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement

Feb 24, 2025
Via arXiv

HPSS: Heuristic Prompting Strategy Search for LLM Evaluators

Feb 18, 2025
Via arXiv

Parametric Retrieval Augmented Generation

Jan 27, 2025
Via arXiv

MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science

Jan 18, 2025
Via arXiv

Agent-SafetyBench: Evaluating the Safety of LLM Agents

Dec 19, 2024
Via arXiv

CharacterBench: Benchmarking Character Customization of Large Language Models

Dec 16, 2024
Via arXiv

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Dec 16, 2024
Via arXiv