Zhisheng Tang

GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning

Jul 02, 2024, via arXiv

Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions

Jun 18, 2024, via arXiv

An Evaluation of Estimative Uncertainty in Large Language Models

May 24, 2024, via arXiv

A Pilot Evaluation of ChatGPT and DALL-E 2 on Decision Making and Spatial Reasoning

Feb 15, 2023, via arXiv

Can Language Representation Models Think in Bets?

Oct 14, 2022, via arXiv