Kaijie Zhu

AgentReview: Exploring Peer Review Dynamics with LLM Agents

Jun 18, 2024
Via arXiv

Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities

Jun 04, 2024
Via arXiv

NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models

Mar 05, 2024
Via arXiv

DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Feb 21, 2024
Via arXiv

The Good, The Bad, and Why: Unveiling Emotions in Generative AI

Dec 19, 2023
Via arXiv

PromptBench: A Unified Library for Evaluation of Large Language Models

Dec 13, 2023
Via arXiv

CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents

Oct 26, 2023
Via arXiv

DyVal: Graph-informed Dynamic Evaluation of Large Language Models

Oct 05, 2023
Via arXiv

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Aug 01, 2023
Via arXiv

EmotionPrompt: Leveraging Psychology for Large Language Models Enhancement via Emotional Stimulus

Aug 01, 2023
Via arXiv