Picture for Jiale Cheng

Jiale Cheng

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Add code
Mar 26, 2025
Viaarxiv icon

LongSafety: Evaluating Long-Context Safety of Large Language Models

Add code
Feb 24, 2025
Viaarxiv icon

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Add code
Dec 30, 2024
Viaarxiv icon

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Add code
Dec 16, 2024
Viaarxiv icon

LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models

Add code
Sep 05, 2024
Figure 1 for LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Figure 2 for LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Figure 3 for LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Figure 4 for LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Viaarxiv icon

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Add code
Jun 24, 2024
Figure 1 for AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
Figure 2 for AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
Figure 3 for AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
Figure 4 for AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
Viaarxiv icon

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Add code
Jun 18, 2024
Figure 1 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 2 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 3 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 4 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Viaarxiv icon

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Add code
Dec 05, 2023
Viaarxiv icon

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Add code
Nov 30, 2023
Viaarxiv icon

Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

Add code
Nov 08, 2023
Viaarxiv icon