Jiale Cheng

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Dec 30, 2024
Via arXiv

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Dec 16, 2024
Via arXiv

LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models

Sep 05, 2024
Via arXiv

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Jun 24, 2024
Via arXiv

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Jun 18, 2024
Via arXiv

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Dec 05, 2023
Via arXiv

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Nov 30, 2023
Via arXiv

Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

Nov 08, 2023
Via arXiv

Safety Assessment of Chinese Large Language Models

Apr 20, 2023
Via arXiv

Recent Advances towards Safe, Responsible, and Moral Dialogue Systems: A Survey

Feb 18, 2023
Via arXiv