Wenyue Hua

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios

Dec 12, 2024

Disentangling Memory and Reasoning Ability in Large Language Models

Nov 21, 2024

Game-theoretic LLM: Agent Workflow for Negotiation Games

Nov 12, 2024

AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

Sep 27, 2024

Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

Jul 15, 2024

MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate

Jun 26, 2024

MoralBench: Moral Evaluation of LLMs

Jun 06, 2024

Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities

Jun 04, 2024

BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis

Apr 23, 2024

Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

Apr 10, 2024